[HN Gopher] Computer Science from the Bottom Up
       ___________________________________________________________________
        
       Computer Science from the Bottom Up
        
       Author : hliyan
       Score  : 183 points
       Date   : 2021-04-29 14:34 UTC (8 hours ago)
        
 (HTM) web link (www.bottomupcs.com)
 (TXT) w3m dump (www.bottomupcs.com)
        
       | thamalama wrote:
       | https://csunplugged.org/en/
       | 
       | "Computer science" is a terrible name. Astronomy is not called
       | "telescope science", and biology is not called "microscope
       | science".
        
         | tantalor wrote:
         | Telescopes and microscopes are subjects of theory of optics,
         | which studies the nature of light.
         | 
         | The "computer" in "computer science" is referring to turing
         | machines, automata, etc, which are subjects of theory of
         | computation, which studies the nature of computable functions.
        
       | [deleted]
        
       | Joker_vD wrote:
       | Slightly off-topic, but... am I the only one who is troubled by
       | the fact that the explanations of "fork and exec" are almost
       | never, ever, discuss the rationale for picking this design
       | instead of "CreateProcess()" a.k.a.
       | "just_launch_this_exe_file()", or even acknowledge this
       | alternative design? A student would probably expect that to
       | launch an app, there is a system call that would do exactly that:
       | you pass it the name of the application/executable file, and it
       | launches. Instead, there is a small dance of duplicating itself
       | and then re-writing one of the copies--try not to get tangled up
       | in the legs while doing that!
       | 
       | The fork+exec has many unobvious advantages over CreateProcess,
       | but also disadvantages, and one of them is that it's, in my
       | opinion, a completely unexpected approach: I can't think of any
       | other system object that can only be created by first copying
       | another already existing object and then overwriting the newly
       | created one if needed. Files of any kind
       | (folder/pipe/socket/etc)? Memory mappings? Signal handling?
       | Process groups/sessions? Users/groups? The processes seem to be
       | unique in this regard. How did people come up with this approach?
       | That would be an interesting discussion, I think.
       | 
       | Instead, it's just described as the completely self-justified,
       | with description of "zombies" thrown in the end for boots: and
       | the zombie processes are pretty much an accidental artifact of
       | early UNIX design; if the fork was returning not only globally
       | visible (and therefore unstable) PID but also an fd associated
       | with the child process, neither wait/waitpid syscalls nor reaping
       | duties of PID 1 would have been necessary.
        
         | patrec wrote:
         | > The fork+exec has many unobvious advantages over
         | CreateProcess,
         | 
         | What are the advantages you see, that are not rooted in the
         | fact that it's very difficult to create any remotely sane
         | declarative API in a language as imperative and primitive as C?
         | In other words which of these advantages would still apply if
         | you wrote your OS in, say, Ocaml?
         | 
         | Splitting fork and exec allows you to roughly hew the process
         | environment of a clone of the parent into shape, process state-
         | wise, before you sacrifice it to birth the child-process which
         | will inherit many of these desired (and generally a few
         | undesired) traits. So one big advantage is that it allows you
         | to re-use an existing (and growing!) set of imperative commands
         | to modify aspects of a running process to specify the same
         | aspects for a child process,and that piecemeal mutation is
         | basically the only way to express anything of any complexity in
         | C, particularly if you don't want to break API compatibility
         | all the time.
         | 
         | The downside is that you generally end up inheriting a bunch of
         | cruft that you really didn't want to, that the imperative and
         | piecemeal nature opens up problems with race conditions, and
         | that there is a lot of overhead only partially mitigated by
         | various complex hacks (COW, various more "leightweight" fork
         | alternatives, special flags to general purpose system calls
         | that are only there to control behavior upon forking, ...).
        
         | dataflow wrote:
         | What I hate about fork is that it fundamentally doesn't even
         | make sense. You fork a process with a window, what happens to
         | that window? Does it get duplicated? Do both control the same
         | window? It has no sensible behavior in the general case without
         | cloning the whole machine, and even then, your network isn't
         | going to get forked. There are limited cases where it could
         | make sense, but the fact that that's not true in general should
         | make it fairly obvious that we need a different spawning
         | primitive. It boggles my mind that people teach fork as if it
         | could be the primitive.
        
           | megous wrote:
           | Depends on what do you mean by window. If the window lives in
           | a X server and you clone the client, window will obviously
           | not get duplicated.
           | 
           | It's quite clearly defined what gets duplicated and what gets
           | shared on clone()/fork(). There really is no ambiguity.
        
             | Joker_vD wrote:
             | So, do threads get duplicated or shared? Or something
             | altogether different happens to them?
        
               | megous wrote:
               | That's documented in man 2 fork.
        
             | dataflow wrote:
             | I'm not saying a particular implementation is ambiguous as
             | to what it _does_. Obviously any implementation will do
             | _something_. I 'm saying that fork as a _concept_ is
             | ambiguous as to what it _should_ do.
        
               | robotresearcher wrote:
               | POSIX defines what a POSIX compliant fork() must do.
        
               | dataflow wrote:
               | Yes it does.
        
               | megous wrote:
               | You're forking a process, not the whole system. Sorry,
               | but I fail to see the ambiguity. If the window is in the
               | process as some framebuffer, it gets cloned too. That may
               | not mean you'll get two windows on your monitor, because
               | your display engine is HW thing, and not a process.
        
               | dataflow wrote:
               | It's because you're thinking "it does X! it's not
               | ambiguous!" but still missing that the question is over
               | why it SHOULD do X instead of Y, not over _whether_ it
               | does X or Y. To a user, it sure as heck doesn 't make
               | sense to fork a process and still end up with one window
               | for both of them. For lots of processes the process _is_
               | its windows as far as the user is concerned.
        
               | robotresearcher wrote:
               | Fork isn't an idea that users need to have. Devs may need
               | to know what their OS API's fork() does. No-one else
               | cares.
        
               | [deleted]
        
         | kccqzy wrote:
         | Well we do have a library call for launching a new process
         | called posix_spawn(3), but it's complicated mainly because they
         | are a lot of options you want to configure when launching a
         | process: which file descriptors would you like to close, which
         | would you like to share with the newly created process, what
         | uid the new process should have, what working directory it
         | should have, etc. It's just a large and complicated call.
         | 
         | But with fork/exec, all of that setup becomes just normal code
         | you run after fork but before exec. You want the child not to
         | have a certain file descriptor? Just close it after fork. You
         | want to drop privileges when running the child? Same thing,
         | just setuid/setgid/setgroups after fork. You want to set up
         | resource limits for the child? Again just setrlimit after fork.
         | 
         | It avoids a lot of complexity in the system call itself.
         | (Naturally, it adds some other complexity elsewhere.)
        
           | wmf wrote:
           | Besides fork() and spawn() there's another option which is to
           | create an empty process, configure it as needed, then start
           | it. IMO this is what simple, orthogonal primitives looks
           | like. This generally isn't possible in Unix but AFAIK it's
           | possible in Mach.
        
           | Joker_vD wrote:
           | Yes, I know. But you only figure it out after you've done you
           | fair share of IPC and management of worker processes by hand.
           | When you're a fresh student, this "fork+exec" just seems like
           | a pretty ridiculous way to organize things: why not just
           | launch the executable you want to launch immediately? Nope,
           | that's at best postponed until the chapter on IPC, at worst
           | it's never discussed at all, so you're left puzzled and with
           | "okay, I guess that's how things are done, if you say so..."
           | feeling.
           | 
           | Oh, and by the way: we have open()/fcntl() for opening files
           | and then fiddling with their settings instead of one open()
           | call with 20 arguments; we could easily have had launch()
           | with the same parameters as execve() that would launch the
           | new process suspended, then we could use... I dunno, even
           | fcntl() on the process's descriptor to configure all those
           | things and then send it SIGCONT when we're done setting it
           | up.
        
             | kccqzy wrote:
             | When you are a fresh student, and when fork/exec is too
             | advanced for you, just use system(3). Not recommended for
             | production use, but that's indeed the easiest way to launch
             | an executable.
             | 
             | Do we optimize our system call design for fresh students or
             | for professionals?
        
               | Joker_vD wrote:
               | We were talking about the presentation of "fundamentals"
               | in the courses on Computer Science, right? The fork/exec
               | design is highly non-trivial and so should probably
               | deserve more discussion, with examples of alternative
               | designs and design considerations, than mere "that's how
               | new processes are created, nothing special here, let's
               | move on".
        
               | kps wrote:
               | If you want to look at it as 'fundamentals', fork only
               | requires the concept of a task, whereas your top-level
               | comment's suggestion also requires (a) secondary storage,
               | (b) organized into files, (c) that can be named.
               | 
               | (That's not _why_ Unix developed with fork, though.)
        
               | Joker_vD wrote:
               | The (b) and (c) are not really needed: early timesharing
               | systems managed to launch processes straight from the
               | punched card decks IIRC. And without (a), what do you
               | even need the second, identical task for? I guess it
               | could be used parallelize crunching the numbers, but
               | that's it; and probably that should perhaps be done from
               | the system operator layer anyway? Like, "launch 5 copies
               | of program on this deck but mark one specific slot in
               | memory different for them, so they would know their ID".
        
         | khaledh wrote:
         | From "Operating Systems: Three Easy Pieces" chapter on "Process
         | API" (section 5.4 "Why? Motivating The API") [1]:
         | ... the separation of fork() and exec() is essential in
         | building a UNIX shell,         because it lets the shell run
         | code after the call to fork() but before the call         to
         | exec(); this code can alter the environment of the about-to-be-
         | run program,         and thus enables a variety of interesting
         | features to be readily built.              ...              The
         | separation of fork() and exec() allows the shell to do a whole
         | bunch of         useful things rather easily. For example:
         | prompt> wc p3.c > newfile.txt                  In the example
         | above, the output of the program wc is redirected into the
         | output         file newfile.txt (the greater-than sign is how
         | said redirection is indicated).         The way the shell
         | accomplishes this task is quite simple: when the child is
         | created, before calling exec(), the shell closes standard
         | output and opens the         file newfile.txt. By doing so, any
         | output from the soon-to-be-running program wc         are sent
         | to the file instead of the screen.
         | 
         | [1] https://pages.cs.wisc.edu/~remzi/OSTEP/cpu-api.pdf
        
       | kingkongjaffa wrote:
       | In the beginning, bell labs invented the transistor...
        
         | tachyonbeam wrote:
         | I mean there were vacuum tubes and relay machines before that
         | :)
         | 
         | IMO relays are a bit easier to understand than transistors
         | because it's a bit easier to understand how to implement
         | various logic gates with them with only very basic knowledge of
         | electronics.
         | 
         | Also, very satisfying clicking sounds:
         | https://www.youtube.com/watch?v=JZyFSrNyhy8&t
        
       | ggggtez wrote:
       | This is not computer science. I see nary a reference to data
       | structures or algorithms, or math of any kind.
       | 
       | These are all engineering concerns, not science.
        
         | globular-toast wrote:
         | Computer science isn't science.
        
           | jll29 wrote:
           | A program often is a model.
        
         | neatze wrote:
         | hmm, applied science is not engineering ?
        
           | ravi-delia wrote:
           | Applied science is engineering, by and large, but it's
           | worthwhile to have different words for different things.
           | 'Engineering' is one, 'Science' is another, even though the
           | latter is useful to the former.
           | 
           | Of course, Computer Science is almost no science, most
           | degrees are some blend of math and engineering.
        
             | jll29 wrote:
             | As with anything that entirely depends who is at work.
             | Watching Linus Torvalds is going to make you arrive at your
             | answer, but if you followed Don Knuth, you may have to
             | reconsider.
        
       | mikedilger wrote:
       | I'm disappointed this didn't start from "sand", quantum
       | mechanics, silicon doping, electromagnetism, electronics,
       | transistors, gates and latches, etc.
       | 
       | Also, Computer Science is a much broader field than what this
       | website addresses including very little science but actually a
       | mixture of math (e.g. P vs NP stuff) and engineering (both
       | hardware and software), this website addressing a subset of the
       | engineering part.
        
       | optimalsolver wrote:
       | I think these CS books should start with a philosophical look at
       | what kind of thing computation actually is, rather than diving
       | straight into UNIX file system structures with barely any
       | context.
       | 
       | But I guess that would be computer science from the top down.
        
         | maweki wrote:
         | I am a theorist and I would say, that the bottom - as it is the
         | foundation - is mathematical logic. And you start from there.
         | 
         | Baffingly enough: All this theory is easily understood with
         | 7th-grade math.
         | 
         | Approaching from some specific piece of engineering is surely
         | top-down.
        
         | akersten wrote:
         | Yeah, the first page of this book being "Everything is a File
         | in Unix" is already so high-level and far above what I would
         | call the "bottom-up" of computer science that this book should
         | be renamed "Unix from the bottom-up."
         | 
         | The "bottom" of computer science would be mathematically
         | explaining what it means to compute, and building up to the
         | abstraction of a Turing machine, and eventually getting to a
         | concrete implementation of what we call programming and code.
         | This book has code on page 2 - but code is nowhere near the
         | bottom.
        
         | hctaw wrote:
         | Good book on that is _Introduction to the Theory of
         | Computation_ by Michael Sipser.
         | 
         | One could argue it's from the _very_ bottom up, not much
         | practicum lies in that text.
        
           | zaksingh wrote:
           | This book was one of the best parts of my CS degree. It's the
           | only textbook which I read cover-to-cover because the writing
           | and narrative were so cohesive. Highly recommend.
        
       | pc9 wrote:
       | Does anyone have a comparison between this and Computer Systems:
       | A Programmer's Perspective? I was planning to read that book and
       | this looks like a similar resource.
        
         | qntty wrote:
         | CSAPP is like the theory behind low-level (C/C++/etc)
         | programming. This is more like a sketch of the concepts behind
         | operating systems. CSAPP is also much longer and more
         | technical. Where they overlap CSAPP is going to be much more
         | complete.
        
       | mariodiana wrote:
       | I can't be the only one, but I feel pretty sure I'm in a minority
       | among computer programmers. This "bottom-up" approach is just not
       | the way my brain works. I definitely feel I'm a top-down sort of
       | thinker and learner, and this puts me at a disadvantage,
       | regarding the literature and approach in this profession.
       | 
       | Are the terms describing the two approaches here "empirical"
       | versus "rational"? I'm talking about building bit by bit as
       | opposed to getting an overview and then drilling down. I
       | definitely feel like I benefit from the second. And I almost feel
       | like it may not be too strong to suggest that I'm actively harmed
       | by the first.
       | 
       | Something like the author's approach is _great,_ for my purposes,
       | when I already pretty much understand something. But I can 't
       | _start_ here.
       | 
       | Again, I feel like I'm in a minority. But anyone else have this
       | turn of mind, too? How do you cope? Do you just look for other
       | materials?
        
         | rambambram wrote:
         | You put into words what I've been feeling for a long time. So I
         | guess you're not alone. I 'cope' by just turning it around:
         | finding some meaningful project, decide what I want it to be,
         | and then find out one step at a time how to get closer. That
         | starts with me finding out - and then being comfortable enough
         | with - a big overview of the project. Somehow I keep coming
         | back at the 'work is like a hill' metaphor from Ryan Singer in
         | his book Shape Up (from basecamp).
         | https://basecamp.com/shapeup/3.4-chapter-13#work-is-like-a-h...
        
         | AlanSE wrote:
         | I think it would be good to re-package this information, but
         | for me, the critical thing that stuff like this accomplishes is
         | comprehensiveness. You have to go through so many classes at
         | school just to have all the knowledge base covered at all.
         | Everything is compartmentalized by default, so that's kind of
         | the mother of all bottom-up approaches.
         | 
         | Top-down takes time and reflection, and re-packaging.
        
       | Jtsummers wrote:
       | Past discussions with comments (including the author addressing
       | the naming issue):
       | 
       | https://news.ycombinator.com/item?id=13249675 (2016, 157
       | comments)
       | 
       | https://news.ycombinator.com/item?id=21903007 (2019, 77 comments)
        
         | dang wrote:
         | Also
         | 
         |  _Computer Science from the Bottom Up_ -
         | https://news.ycombinator.com/item?id=7611093 - April 2014 (44
         | comments)
        
       | brundolf wrote:
       | This seems highly focused on systems; I wouldn't call it
       | representative of "Computer Science" as a whole. Only maybe 2-3
       | of the classes in my computer science degree were focused on this
       | kind of stuff.
       | 
       | Still seems like a good resource, just perhaps mislabeled.
        
         | macintux wrote:
         | I'm surprised you had that many. My CS program had nothing
         | comparable that I can remember.
         | 
         | Definitely not computer science.
        
           | brundolf wrote:
           | We had one called "Systems Programming" that was a whirlwind
           | intro to everything Unix (6 weeks on bash, 6 weeks on C, 6
           | weeks on Perl, a mishmash of SSH and everything-is-a-file and
           | Unix-philosophy, etc)
           | 
           | We had another called "Operating Systems" that focused on
           | things like concurrency, building a really basic memory
           | pager, etc (mix of C and Java I think)
           | 
           | And we had one other called "Networking" that was sort of
           | along the same lines as "Operating Systems" but focused on,
           | well, networking. Mostly Java; we wrote our own application-
           | layer protocols, that sort of thing
           | 
           | But yeah pretty much everything else - including the more
           | "concrete" programming courses like Data Structures and
           | Algorithms - was fairly abstracted away from any specific
           | system
        
             | macintux wrote:
             | Nice, and now that I think of it we did have an operating
             | systems class. I know better than to trust 25-year-old
             | memories.
        
               | michaelcampbell wrote:
               | My undergrad CS degree I don't think contained any sort
               | of OS course, other than maybe a survey of them but there
               | was a graduate level class that was required for any
               | higher degree. I took it before I graduated with an
               | undergrad because it was interesting, and it was quite
               | extensive as I recall. That was well over 20 years ago,
               | however.
        
             | ivankolev wrote:
             | That sounds a lot like part of the UofT CS program.
        
         | elcapitan wrote:
         | Yeah it's basically an Operating Systems book like Tanenbaum's
         | "Modern Operating Systems".
        
         | zerkten wrote:
         | > This seems highly focused on systems; I wouldn't call it
         | representative of "Computer Science" as a whole.
         | 
         | I'd agree with this, but it fits with how a certain set of
         | schools teaches their core. They will teach a set of related
         | classes around the topics here: computer architecture,
         | operating systems, and networking, because this is often
         | central to their research interests. Others will scatter this
         | information across a number of unrelated courses to check the
         | box, or skip it entirely because their research doesn't touch
         | as heavily on these topics.
         | 
         | Further research on the author suggests they are from New
         | Zealand. I'm from the UK and live in the US. What I see taught
         | in US CS degrees is somewhat different from what I saw in the
         | UK, and heard about from friends in continental Europe. I
         | imagine the author's experience is similar.
         | 
         | A further bias from the author, and I'm making big assumptions,
         | is that see value here because they've worked in places that do
         | lot of system programming. Mentions of VMware and Red Hat on
         | https://www.wienand.org.
        
         | userbinator wrote:
         | Yes, the same comment appears whenever this comes up on HN.
         | Bottom-up would actually be something more like Nand2Tetris or
         | Petzold's _code_.
        
         | _wldu wrote:
         | Years ago, systems was the general purpose CS degree. I guess
         | it is/was classical CS. Today, there are many specialties
         | (data, security, algorithms, etc.), but many top-ranked CS
         | schools still offer a traditional systems degree.
        
         | Yottaqubyter wrote:
         | The reason for the name seems to be because of its focus on
         | systems. I do agree that the way it's labeled isn't the best,
         | but it's not trying to be representative of all Computer
         | Scionce.
         | 
         | In fact, it seems to specifically be trying to have a different
         | focus than a cs degree:
         | 
         | > This book aims to move in completely the opposite direction,
         | working from operating systems fundamentals through to how
         | those applications are complied and executed.
        
           | brundolf wrote:
           | "opposite direction" tends to imply "the same content, just
           | with a different order/framing". Whereas the OP leaves out
           | the majority (not all, but more than half) of what people
           | tend to put under the term "computer science" completely
        
       | ChrisArchitect wrote:
       | anything new added?
       | 
       | Some earlier discussion:
       | 
       | 1 year ago https://news.ycombinator.com/item?id=21903007
       | 
       | 4 years ago https://news.ycombinator.com/item?id=13249675
       | 
       | 7 years ago https://news.ycombinator.com/item?id=7611093
        
       | wildmanx wrote:
       | As much as I appreciate such material, but this is not "computer
       | science". Maybe "computer engineering", or some mix of "software
       | engineering", "systems engineering", those things.
       | 
       | Computer science is about algorithms, complexity, possibly
       | foundations of programming languages, cryptography, database
       | theory, those things. Not about Unix or file formats.
       | 
       | Those are important too, but not "computer science".
        
       | wrnr wrote:
       | The ELF binary format, really? This is like a course advertising
       | itself as an introduction to normative ethics and proceeding to
       | lecture people on canon law in the 14th century in Flanders. Like
       | sure if you into obscure shit this might be interesting but dunno
       | what one has to do with the other.
        
         | Joker_vD wrote:
         | Well, more like the common law in the 21st century in the USA.
         | Quite useful in nowadays practice, but obviously neither
         | universal nor eternal. But could be a good starting point as a
         | running example, or a reference point when comparing to the
         | "civil law" systems.
        
         | megous wrote:
         | Not sure what's obscure about ELF. Almost every one of the
         | programs I use is wrapped in ELF format, except for maybe a
         | bootloader and scripts.
         | 
         | If you want to learn from bottom up, then learning what
         | programs are seems like an obvious approach.
        
       | snicker7 wrote:
       | Looks more like computer engineering than computer science.
        
         | MithrilTuxedo wrote:
         | Yeah, that kinda bothered me too.
         | 
         | >Computer Science is no more about computers than astronomy is
         | about telescopes.
        
       | xbar wrote:
       | Similar but different from
       | https://www.bartlettpublishing.com/site/product/programming-...
        
       | beckman466 wrote:
       | Just read page one and two and I already LOVE the conversational
       | style of writing from the author. Who the hell decided we should
       | write textbooks in such a dry and formal way (this has been my
       | experience). Maybe the textbooks fail me because the author is
       | not able to translate their excitement into text (I have ADD, so
       | there's that too)? Of course it's also due the nature of the
       | copyright/IP system, together with the hierarchies inherent to
       | capitalism. The competition between workers selling their labor
       | works to the advantage of employers/capitalists.
       | 
       | Anyway, learning is basically just a long conversation where we
       | learn about the feedback loops about whatever concept or system
       | we are trying to understand and interact with. For me this
       | author's style (and others like him) works a million times better
       | to do that in an intuitive way.
        
       ___________________________________________________________________
       (page generated 2021-04-29 23:01 UTC)