Programming the C/C++ Abstract Machine with a NOP

What was the first computer program you wrote?

For me, it was generating the first 100 prime numbers using the Sieve of
Eratosthenes. I had preceded that with many failed attempts at a problem my instructor had given me – parse a string of characters for valid decimal numbers. I couldn’t understand the purpose of the problem, or how to go about it. It all seemed silly and superfluous. But generating prime numbers was what did it for me. I understood the problem well enough to bother only about getting it work on my computer. It was a moment of epiphany when I realized that all I need to do to solve a problem of my interest is to carefully think about a set of instructions for the machine and the machine will faithfully, tirelessly follow them to get me an answer. This happened in 1993. I was not a teenager- I came late to computer programming – I had just finished my Bachelor’s in physics. But the delight of seeing those numbers getting printed on the screen resonated in my soul for days ahead. I was smiling and giggling with myself. All I could think of was about the next program I should write for that machine. Over the years, it has been rare to match that joy of my programming dawn.

I wrote my first program in K&R C. And I became a reluctant learner of new programming languages and dialects. Around me, I saw my friends wax eloquent about this awesome language or that nifty feature – but for me, it was always a question of show me a problem that you can do in the new language/dialect and I cannot do in C!! As a result, I stayed monotheistic in my programming practice for a very long time. And later, even when I stumbled into the deviations of C++ and Perl, I was still a programmer of the C DNA – Literals, Variables, Operators, Statements, For/While Loop, and the If/Else/Switch – these I cherished as valuable and worth knowing. The rest were mere details…..

What I didn’t realize then, and appreciated much later, was that the C Language describes an abstract (Von Neumann) machine which was very close to a real physical computer of 1970’s. Incredibly, it still helped my intuition as I wrote my programs in 1990’s. In other words, the language put very little in between the programmer and the real physical machine. Much water has flown through river Ganga since, and many new languages have become flavor of the seasons, but I find this aspect of the C Abstact Machine Model amazing!!

Now the variety of abstract machines for programming languages is an interesting subject in itself and I found this paper [1] very informative. However its description of C/C++ is very terse – everything for popular imperative languages is binned under the Algol60 family. On the other hand, the entire C++11 specification  [2] claims to be the description of an abstract machine for the language. Here is the quote from section 1.9 (the as-if rule) –

The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine. This International Standard places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine …

an implementation is free to disregard any requirement of this International Standard as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program. For instance, an actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no side effects affecting the observable behavior of the program are produced

That’s quite a mouthful and best avoided if you want to stay sane. Instead let me describe the abstract machine in simple, practical terms for a C/C++ programmer. The machine consists of a CPU (or better let’s call it the C++ALU) which executes program statements, atomically, serially one at a time. The C++ALU interacts with a Memory for operations on objects. Objects are entities living in Memory and these operations are for creation, manipulation, and destruction of objects. There are facilities for IO for the CPU through a separate subsystem. One can examine the machine state (defined by objects in memory) before and after every statement execution.

If you are familiar with modern computer architecture, you’ll quickly realize that this is a ridiculous machine compared to a real x86 platform on which we program. For example – there is no notion of memory/cache hierarchy, or machine registers, or machine operations, or out-of-order instruction pipeline, or multiple instruction issue etc. Also no virtual memory, system calls, traps, privilege modes etc. But even then, a beginning C/C++ programmer will benefit if she behaved as if she wrote her programs for the abstract machine.

For instance, let’s write the simplest valid C++11 program (first.cpp) – what I call the C++ALU’s NOP instruction. A program which does nothing. Here it is :

int main() {}

In all the examples, I use g++ 4.8.1 on a x86_64 Cygwin Machine and set the compiler options to -Wall -Werror -std=c++11. It comes as a surprise to many that main need not return a value, but the program is valid C++11 nevertheless. If you remove the return type, that is not valid C++11 by the standard, and the compile fails. If you change the return type to anything other than int, that is not valid C++11 either. main is a special function in C++11, it need not explicitly return. The default, implicit behavior is return 0.

Back to the first.cpp. Here is the compile command:

g++ -Wall -Werror -std=c++11 -g first.cpp -o first.exe

What should the size of the executable be? It is just a NOP instruction – so just one x86_64 instruction – say 64 bits, that is 8 bytes. du -b first.exe gives it as 62352 bytes!! Well that -g compile option puts a lot of debug information (symbol table) in the exe, so if I compile without -g, I get the size as 61660 bytes!! If I explicitly tell the compiler to remove the entire symbol table by a -s command switch, the executable size still comes to 6656 bytes!! Even if I try adding the size optimization switch -Os the size doesn’t go below 6656 bytes on my machine. That is an inflation of 832x from our original estimate of 8 bytes!!

This additional executable size comes from C++ runtime and standard library linked into your executable. Therefore, the first thing to realize is that programming a NOP in C/C++ is not like assembly – there is a runtime overhead which comes from the language itself. An objdump -d first.exe will show you in disassembly that our NOP program has lots of real x86 instructions to execute!!

To make matters more tractable, let’s just compile the program into an object file with

g++ -Wall -Werror -std=c++11 -s -Os -c first.cpp -o first.o

We’ve just compiled our source without linking the standard libraries and runtime. du -b first.o now gives 864 bytes!! Now we are getting somewhere. objdump -d first.o shows us the actual x86 instructions:

first.o: file format pe-x86-64

Disassembly of section .text.startup:


0000000000000000 main:
0: 48 83 ec 28 sub $0x28,%rsp
4: e8 00 00 00 00 callq 9 main+0x9
9: 31 c0 xor %eax,%eax
b: 48 83 c4 28 add $0x28,%rsp
f: c3 retq

That is 5 x86 instructions with 16 bytes.

 

REFERENCES

  1. Stephan Diehl, Pieter Hartel, Peter SestoftAbstract machines for programming language implementation, 1999, http://www.inf.ed.ac.uk/teaching/courses/lsi/diehl_abstract_machines.pdf
  2. ISO C++ Standard, https://isocpp.org/std/the-standard
Posted in Programming | Tagged | Leave a comment

Traumatic memories : This did happen in my Kashmir, many many times over

Basharat Peer, now of Curfewed Nights fame, but then a young reporter at Rediff, covered the gruesome Nadimarg massacre of Kashmiri Pandits in South Kashmir. He titled his piece This did not happen in my Kashmirthe date was Mar 24, 2003, and I was in S.Korea, working for a semiconductor lab as a researcher. The previous decade and more had been very hard. The journey – from Srinagar in 1990 as a Kashmir Hindu migrant at penury to Suwon, S.Korea in 2003 as a researcher in one of world’s leading research labs – had been psychologically traumatic and physically draining. But it was a trauma hidden and not shared. How could you walk with your wounds on the sleeve around your friends and acquaintances. They saw you working hard to solve a technical problem and debug code. Working so hard to somehow buy a place which you can call home. Opening my wounds was too demanding. I was gauche.

But Basharat’s piece triggered the pain in my festered wound. I was angry. Really angry. I wondered why there was no voice of the Kashmiri Hindus – there should have been screams. Gujarat riots had happened recently, and I was seeing a sea of narratives in the public domain. It hurt that Rediff sent a Kashmiri Muslim to cover the massacre. But what hurt most was that the piece sounded false. It was unlike anything I experienced in my life in Kashmir. I did not necessarily doubt Basharat’s intentions and loyalties – I just understood, that he like many others around me, had started giving life to his fantasies in the face of an ugly, all-imposing reality.

So I wept and I wept and I wrote my response to Basharat’s piece while my office mates were around me thinking whatever happened to their jovial hard at work colleague. Here is what I wrote on Mar 25, 2003:

This did happen in my Kashmir, many many times over

Dear Basharat,

Nice sentimental article, befitting our emotions at times like this. I confess, I had tears in my eyes too, and wondered if your and my feelings are so isomorphic (I am a displaced Kashmiri Pandit), why is there so much voilence and misery for our ilk!!

But the sentimental hue washes away, when I reflect back on my own sense of truth and experience. As an example, was it not the same Anantnag, your home district, where there were massive riots against Pandits in 1986, many of their houses and temples burned and destroyed. Infact there was a mini-Pandit migration at that time too. In my living memory, there are too many anti-India-which-translated-into-anti-hindu incidents actuated by Kashmiri Muslim miscreants in the last 30 years. And almost all of those feelings stemmed from intense religious hate. It is just that while earlier those feelings resulted in a wrecked Matador (mini-bus) or a burned house, they are now amplified into killings.

The reason, my friend, is that those maniacs who you mention in your article, have (or had in early 1990’s) a lot of silent/overt support from the Kashmiri Muslim masses for one and one reason alone – they promised Nizam-i-Mustafa: the holy dream of pure Islam, defined only in negation of India and Hindu. And your being a muslim – is the only reason, that while you can work in the Hindu Delhi, and applaud Naipaul’s criticism of Islamic societies, you can still pack a few shirts and jeans and amble back to Kashmir. Whereas, I, a Kashmiri Pandit- although not very religious, and who has a healthy respect and knowledge of Islamic history, am not welcome and will be shot at.

Our history is ripe with many many such horrors, and even on a grander scale – from the maurauding Afghans to Sikandar But-Shikan, there have been countless Hindu Killings and Migrations from Kashmir – to ignore that wound is a generalization of sentimentality

-Samvit Kaul

Why am I writing this today, more than another decade later!! Last Sunday (Sep 25, 2014) I visited Bangalore Literature Festival just to hang out and meet some friends who were visiting. I did not know the program listing, and vaguely remembered that Girish Karnad was supposed to be speaking on his life. I’ve been a big fan of his early work – especially Tughlaq, which had an almost spiritual impact on me during my loneliness in Europe (let’s leave that for another time). Imagine my surprise when I heard Kashmiri poetry being recited as I was loitering around. I followed the voice and that is how I met Nisar Azam, and eventually through him Sameer Arshad, Sunayna Kachroo and a whole lot of other Kashmiris. They told me about a panel discussion on Kashmir in the evening and asked me to participate. So I stayed and chatted with this group.

What happened at the panel discussion on Kashmir made me very sad. There were a lot of well meaning (hopefully) folks who reminded me of the character Jane in Naipaul’s novel Guerrillas – with a deeply flawed sense of reality, and wanting to contribute, with a naive mixture of adventure and piety and moral correctness, and imposing their fantasies. The panel was full of traded cliches and how we can make everything work in future – as it has been in the past.

I was sad but not crushed this time – I’ve long lost the belief that real world conflicts can be solved by talk and discussion or even scholarship. However, in some crevice of my mind, I got reminded of Basharat’s post and my response. I’d forgotten the actual lines I wrote, but I was intensely reminded of my sense of hopelessness and despair then.

So I Googled. And as Linus Trovalds said somewhere “Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it ” I found my written words of 2003 and there began a revisit to an old memory. Past leaking into present. But this time there were no tears.

Posted in Kashmir | Tagged , , , | 6 Comments