30 years of C

awolf · on Sept 22, 2009

"Don’t leave unused code hanging around because it might be useful someday. If it’s not being used right now, remove it, because even just sitting there it’s costing you."

Couldn't agree more - every extra line makes the source code that much harder to wrap your brain around.

If I glance at a section in my code and can't instantly grasp what it is doing and how it fits into the bigger picture then I know I still have refactoring to do.

DarkShikari · on Sept 22, 2009

And version control keeps it around for you anyways.

I can't even count the number of times I've heard "but this code will be useful if/when we add support for feature X" used to justify hundreds or even thousands of lines of ifdeffed-out cruft.

sunkencity · on Sept 22, 2009

I worked with someone who never threw old code away. It was amusingly difficult to try to understand what his code did. Typically shitloads of php files that didn't do anything and in the files that actually did something stuff like

   if (false) { ... large incomprehensible code block .. }

was common. Why comment out stuff when you can just disable it programmatically?

davidw · on Sept 22, 2009

I did some work at a place where they had a PHP class called something like "foobar_2004_08". Given that it was 2006, I asked about axing it, but was told it was still in use. Shudder

ajb · on Sept 23, 2009

There is actually a reason for doing that, as opposed to commenting it out - at least in compiled languages, the compiler will complain if code changes elsewhere conflict with it, so it doesn't bitrot as fast.
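
To illustrate the point above, here is a minimal C sketch contrasting the two ways of disabling code. The function names (`price_with_tax`, `old_price_with_vat`, `total`) are made up for this example and come from nothing in the thread. The `if (0)` body is still parsed and type-checked, so it breaks the build when the code it depends on changes; the `#if 0` body is dropped by the preprocessor and can rot silently.

```c
/* Hypothetical helper, invented for this illustration. */
int price_with_tax(int cents, int percent)
{
    return cents + cents * percent / 100;
}

int total(int cents)
{
    if (0) {
        /* Dead at run time, but still compiled: if the signature of
         * price_with_tax() changes, this call stops building. */
        return price_with_tax(cents, 20);
    }

#if 0
    /* Skipped by the preprocessor, so this stale call to a function
     * that no longer exists anywhere goes unnoticed. */
    return old_price_with_vat(cents);
#endif

    return cents;
}
```

Whether that extra checking is worth the clutter is exactly what the rest of the thread argues about.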

scotty79 · on Sept 22, 2009

The cause could be that inside the block there were already /* */ comments, and the editor did not have a single-keystroke option to add // at the beginning of each selected line.

sunkencity · on Sept 22, 2009

no. he used Eclipse. Apple-7 would have worked.

scotty79 · on Sept 22, 2009

Ok. So I have another theory. In place of false there was previously some ugly condition, and he wanted to get rid of the things that were done on this condition with minimal effort, hoping that someone in the future with more time (maybe even himself) would clean that up, since doing cleanup in such a case is not a hard thing to do.

theli0nheart · on Sept 22, 2009

Ugh, makes me sick to my stomach.

fogus · on Sept 22, 2009

Yes, version control does keep it around for you, but often it is not the easiest thing to associate some piece of code with a repo version. I am not advocating keeping chunks of commented-out code around; just saying that I have yet to find an optimal solution to the other side of the problem either.

cema · on Sept 22, 2009

An easy solution is to split it off to an "Unused" pseudoproject in the source control system. This avoids archeology on historical versions.

joeyo · on Sept 23, 2009

  > This avoids archeology on historical versions.

It's not really your main point, but the idea of revision control archeology is really a great analogy.

holygoat · on Sept 23, 2009

I favor a 'scratch' or 'junk' dir in each project. Source that I know I don't need now, but want to keep around. grepping for terms will often find a chunk of code I wrote that nearly solves a new problem.

Version control is not so easy to search, particularly if you're looking for a utility that you think is live in the project.

cema · on Sept 23, 2009

Yes, grepping a folder is easy. Of course, you can always update it from the source control database. :-)

InclinedPlane · on Sept 22, 2009

On the next episode of Code Hoarders...

fogus · on Sept 22, 2009

" When I worked for Apple a few years later I was dismayed that it was a Pascal shop..."

I never knew this, interesting tidbit. Like many programmers I've read my share of Apple history books, but somehow Apple Pascal eluded me.

cesare · on Sept 22, 2009

Yep. Much of the original Mac OS was written in Pascal.

I still have a boxed copy of Symantec Think Pascal. I used it to make my first steps in window based/event driven applications programming on the Mac (in 1990).

I remember coding my first apps invoking the system calls directly to create windows, handling events (see where the user clicked) etc. Then I moved to Codeworrior which had a nice c++ framework.

drewr · on Sept 22, 2009

If my high school CS teacher had told me that Mac OS had been written in Pascal, I would have been slightly more enthused about learning it. I had a similar, nebulous unease with it as dadhacker and really enjoyed C when I got to college.

cesare · on Sept 22, 2009

I dropped out of college (CS) instead, also because of an episode related to this.

An assignment for an exam was to make a simple library management program in Pascal (cataloging books, lending, receiving them back etc.), mainly to test our understanding of pointers.

Even though we had Macs at the university, we were asked to make a command-line app. But, since I was studying Mac OS GUI programming by myself, I made a graphical application instead.

The assistant professor who was evaluating our assignment didn't believe that I had made it myself. She probably couldn't understand it herself (but, obviously, all the relevant functions were separated from the GUI code). So she asked me to rewrite everything from scratch as a command-line application in a couple of hours in front of her.

I managed to do it but I was really really pissed off. Shortly thereafter I started working and dropped out.

barrkel · on Sept 22, 2009

You ignored the requirements and demonstrated expertise in the wrong area, IMO.

In a similar situation, I wrote the required command-line app, but it handled queries, basically parsing to an expression tree predicate.

Of course, it doesn't matter now.

cesare · on Sept 22, 2009

> You ignored the requirements and demonstrated expertise in the wrong area, IMO.

The assignment was about implementing the functionality of a simple library management app. And it was explicitly to test using pointers, linked lists etc.

The requirements were fully implemented. And all the queries were handled by functions which were completely separated from the gui code.

Basically, instead of a textual menu (1 - borrow, 2 - return etc.) you had a Mac OS menu and the results were presented inside fields in a window instead of a textual output.

If I had to convert the program to a textual interface by editing my source (instead of doing everything from scratch) it would have taken me 5 minutes (or less). IIRC the code was already there and simply commented out.

Keep also in mind that our lab was full of Macs (50-60 machines) with just one PC (which, btw, was used by a guy who had made the app with his own advanced textual interface - something like ncurses - and who met a similar fate to mine). So I thought it was a plus to use the native GUI of the OS we were using all the time.

> Of course, it doesn't matter now.

Of course. It happened 17 years ago :-)

nitrogen · on Sept 23, 2009

In my experience, CS professors like automating the grading process as much as possible. It's conceivable that your assistant professor was upset she couldn't just run your app through the grading script.

cconstantine · on Sept 23, 2009

We had something similar at Purdue. In fact, we were frequently given an example 'correct' program whose output we needed to match exactly. It took me much longer to get register assignment in my compilers' class to match the professor's results than it took me to get it 'correct'.

This gets to interfaces though. I don't know about the gp, but if the assignment specifies a particular interface then that becomes a part of the spec. It doesn't matter that you made a more advanced interface, if it doesn't match the required interface it's wrong.

theli0nheart · on Sept 22, 2009

What's incredible is that a lot of the time, GUI-based applications are only a smidgen more complicated than their command-line counterparts. Sad that your professor didn't seem to know this.

DrJokepu · on Sept 23, 2009

As he mentioned, this happened 17 years ago. While I never programmed the original Mac OS, I know that Win16 GUI programming wasn't the simplest thing ever at all - it was considerably more complicated than parsing a command line; you had to deal with message queues and pass pointers around with no nice OO interfaces.

billswift · on Sept 23, 2009

You might want to reread the part of the post on the importance of spelling. I am not familiar with program names - is that supposed to be Codeworrier or Codewarrior, or am I wrong and it's spelled correctly?

cesare · on Sept 23, 2009

Yes it's spelled Codewarrior. Sorry.

cesare · on Sept 23, 2009

s/Codeworrior/Codewarrior/g

huhtenberg · on Sept 22, 2009

Ever wondered why almost all of Win32 API is __stdcall ? :)

coliveira · on Sept 22, 2009

I like this part: "Template meta-programming is a great example of people being clever without being responsible."

cturner · on Sept 22, 2009

    can’t help but think that Sun continued to blow its
    opportunities here, for years.

Usually I jump on the bash-java bandwagon. However, I think the author's wrong about this. Java is good in a team setting, and Java is the best choice for the enterprise.

I support a cross-platform, plugin-oriented Java application with a large install base and wouldn't want it written in anything else.

C applications will sometimes segfault, and threaded ones log less helpfully on the way down, whereas in Java there's a command that lets you analyse the threads of a running process. When there's a Java problem I have much better odds of getting my head around the source than I would for the sort of macro-affected C source that you'd need for an application run in this setting. If it were to run out of memory it would be clear about it. When you're stuck with libraries where the source and developers are long gone it's realistic to use a decompiler. With some preparation at dev time you can use the classpath and classloaders to hot-patch.

C is good for programmers, but not so good for the other technical workers. And I don't think C is as much of a big deal as it used to be due to the advances in scripting languages. I do a fair bit of programming oriented around unix system calls, all in python. Sometimes I find the documentation is better in C, and mock it up there, and then carry it back to python where I can do more with it. Also, if you're on Bigco's standard-issue Solaris host you're far more likely to find perl than a C compiler. Computers are faster than they used to be. Raw execution speed and memory footprint of a process (areas where C is stronger) are much less relevant to scale issues than they used to be.

Reply said:

    The author is comparing Java to C#, not C.

Er - ah - so he did. Thanks for being polite about it.

arohner · on Sept 22, 2009

The author is comparing Java to C#, not C.

kabdib · on Sept 23, 2009

If it matters, I happen to agree with you, as what you said holds true for Java-vs-C as well as for C#-vs-C.

For Java-vs-C#, I hold my ground :-)

I tend to hack up Perl-like tools in C# these days, using a small regexp-centric framework that also handles command line parsing. It's not much more code, and I feel better about it in my current community.

ivankirigin · on Sept 22, 2009

I "know" C. But are there any good resources on learning to build real systems in C? I don't like C++.

wooby · on Sept 22, 2009

The best resource I've found was "The Unix Programming Environment," which goes beyond the C language and into using common libraries and organizing C projects with make and other tools.

Other than that, some food for thought on use of C in the real world is this article by Rob Pike, "Notes on Programming in C:"

http://www.lysator.liu.se/c/pikestyle.html

nkurz · on Sept 22, 2009

I've found some books like Maguire's "Writing Solid Code" and McConnell's "Code Complete" to be helpful, but your best resources are probably the source code to systems similar to your definition of 'real'. Depending on this definition, I'd suggest looking at the source for SQLite (compact and rock solid), Apache (customizable request handling), Perl (for polymorphic data types), and Linux (a big modular system). Basically, pick any of the open source tools you use, and see how they've solved the problem you are interested in!

far33d · on Sept 23, 2009

I learned all I know of C from doing the projects for CS167/9 at Brown. The lectures, and more importantly, the project assignments, are online:

http://www.cs.brown.edu/courses/cs167/lect.old.08.shtml

http://www.cs.brown.edu/courses/cs167/asgn.shtml

snorkel · on Sept 22, 2009

The single most glossed-over topic in C textbooks is how to avoid using fixed-size buffers and arrays, and instead use dynamic allocation of string buffers, arrays, and dynamic list structures. Kenneth Reek's book "Pointers on C" is excellent for understanding this in vivid detail. You cannot read that book and then claim to be confused by pointers ever again.
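
As a rough illustration of the dynamic-buffer idea above (not taken from Reek's book), here is a minimal sketch of a string buffer that grows with `realloc` instead of relying on a fixed-size array. The `strbuf` type and its function names are invented for this example, and it assumes the doubled capacity never overflows `size_t`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string buffer; names are invented for illustration. */
struct strbuf {
    char  *data;  /* NUL-terminated once anything has been appended */
    size_t len;   /* bytes used, excluding the terminator */
    size_t cap;   /* bytes allocated */
};

/* Append s, growing the allocation as needed.
 * Returns 0 on success, -1 on allocation failure (buffer left unchanged). */
int strbuf_append(struct strbuf *sb, const char *s)
{
    size_t n = strlen(s);
    size_t need = sb->len + n + 1;

    if (need > sb->cap) {
        size_t cap = sb->cap ? sb->cap : 16;
        while (cap < need)
            cap *= 2;              /* geometric growth keeps appends cheap */
        char *p = realloc(sb->data, cap);
        if (p == NULL)
            return -1;             /* the old block is still valid */
        sb->data = p;
        sb->cap = cap;
    }
    memcpy(sb->data + sb->len, s, n + 1);  /* copies the terminator too */
    sb->len += n;
    return 0;
}

/* Release the storage and reset the buffer to its empty state. */
void strbuf_free(struct strbuf *sb)
{
    free(sb->data);
    sb->data = NULL;
    sb->len = sb->cap = 0;
}
```

Usage would start from a zero-initialized `struct strbuf sb = {0};`, call `strbuf_append(&sb, "...")` as often as needed, and finish with `strbuf_free(&sb)`. Assigning the `realloc` result to a temporary first is what keeps the original block from leaking on failure.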

sunkencity · on Sept 22, 2009

Look at successful open source projects and read the source.

silentbicycle · on Sept 22, 2009

Some successful open source projects have dreadful source, though.

There are several threads on HN recommending source to read. (Off the top of my head, I'd recommend the source for Lua and OpenBSD's userland utilities.)

thamer · on Sept 22, 2009

Lua is often recommended, but the one I've seen the most praise for is SQLite.

brianobush · on Sept 22, 2009

Lua is a nice start; here is a walk-through that I have saved from somewhere:

- lmathlib.c, lstrlib.c: get familiar with the external C API. Don't bother with the pattern matcher though. Just the easy functions.

- lapi.c: Check how the API is implemented internally. Only skim this to get a feeling for the code. Cross-reference to lua.h and luaconf.h as needed.

- lobject.h: tagged values and object representation. skim through this first. you'll want to keep a window with this file open all the time.

- lstate.h: state objects. ditto.

- lopcodes.h: bytecode instruction format and opcode definitions. easy.

- lvm.c: scroll down to luaV_execute, the main interpreter loop. see how all of the instructions are implemented. skip the details for now. reread later.

- ldo.c: calls, stacks, exceptions, coroutines. tough read.

- lstring.c: string interning. cute, huh?

- ltable.c: hash tables and arrays. tricky code.

- ltm.c: metamethod handling, reread all of lvm.c now. You may want to reread lapi.c now.

- ldebug.c: surprise waiting for you. abstract interpretation is used to find object names for tracebacks. does bytecode verification, too.

- lparser.c, lcode.c: recursive descent parser, targeting a register-based VM. start from chunk() and work your way through. read the expression parser and the code generator parts last.

- lgc.c: incremental garbage collector. take your time.
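
For anyone unfamiliar with the "tagged values" mentioned for lobject.h in the list above, here is a generic sketch of the technique in C. This is not Lua's actual code: the type and function names are invented for illustration, and only three value kinds are shown. The one Lua fact it borrows is the truthiness rule, where only nil and false count as false.

```c
#include <stdbool.h>

/* Hypothetical tagged value: a type tag plus a union holding the payload. */
enum value_tag { VAL_NIL, VAL_BOOL, VAL_NUMBER };

struct value {
    enum value_tag tag;   /* says which union member is live */
    union {
        bool   b;
        double n;
    } u;
};

struct value make_nil(void)
{
    struct value v = { .tag = VAL_NIL };
    return v;
}

struct value make_bool(bool b)
{
    struct value v = { .tag = VAL_BOOL };
    v.u.b = b;
    return v;
}

struct value make_number(double n)
{
    struct value v = { .tag = VAL_NUMBER };
    v.u.n = n;
    return v;
}

/* Lua-style truthiness: only nil and false are falsy, so 0 is truthy. */
bool value_is_truthy(struct value v)
{
    switch (v.tag) {
    case VAL_NIL:  return false;
    case VAL_BOOL: return v.u.b;
    default:       return true;
    }
}
```

Every operation checks the tag before touching the union, which is the part worth looking for when reading a real interpreter's object representation.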

silentbicycle · on Sept 22, 2009

That's from Mike Pall's guide to the Lua source, if you're curious. (He's the author of LuaJIT.)

silentbicycle · on Sept 22, 2009

I've heard that too, but haven't read any of its source myself. (The OpenBSD userland is particularly good because it has a lot of utilities that are useful but small, and whose code can be read in isolation.)

tedunangst · on Sept 22, 2009

I have a lot of praise for SQLite the product, but wouldn't extend it to using the source as a learning aid.

nkurz · on Sept 23, 2009

I would. The approach to testing is fantastic, the style is extremely consistent, the comments and commentary are well thought out, and I find it a good compromise between clarity and efficiency. Why wouldn't you recommend it?

sunkencity · on Sept 22, 2009

>Some successful open source projects have dreadful source, though.

Still a good learning experience, but yeah, it would be preferable to choose something well known for quality code. I learnt a lot from programming an apache2 module. Supposedly the apache code isn't the best C project available, but there are some interesting real-world aspects of it.

jcw · on Sept 23, 2009

I'm trying to do this myself--I taught myself C, but still can't write any non-trivial programs. Some people find C easy to comprehend, and think naturally in terms of pointers. Not me.

Gnu Ed's source code is quite easy to read. Plan 9's utilities also have very clear, concise source code.

ivankirigin · on Sept 22, 2009

Like what? Memcache is written in C, right? Python itself?

Anyone who has actually followed this method want to chime in?

neilc · on Sept 22, 2009

PostgreSQL is the best large C codebase I've ever seen.

jerf · on Sept 22, 2009

Look for anything where there are multiple groups using some backend. libpurple is a decent example: it's not perfect, but it's pretty good.

Anything where there's just one group using it lets too much stuff get through.

there · on Sept 22, 2009

i think the key is to find something secure (and thus, "correct") and not doused with tons of portability goo (ifdefs, etc.)

look at openssh, but the one in openbsd's tree and not the portable branch. for that matter, any of the openbsd-developed daemons (openbgpd, openntpd, etc.) at http://www.openbsd.org/cgi-bin/cvsweb/src/usr.sbin/ or usr.bin/ssh for openssh.

any of the openwall projects are also small and easy to work with - http://www.openwall.com/

henryprecheur · on Sept 23, 2009

Secure & correct programs don't always have nice code.

For example qmail is probably the most secure smtp server around. But the code is not easy to read.

Pretty much all the stuff Dan Bernstein writes is secure and correct. But his coding style is just too weird for me and many others.

there · on Sept 23, 2009

yes, i avoided listing djb's stuff for that very reason.

dschobel · on Sept 22, 2009

I imagine you'd get the most utility out of looking at the source of a system doing something similar to what you're trying to build.