[HN Gopher] Computer Science from the Bottom Up ___________________________________________________________________ Computer Science from the Bottom Up Author : hliyan Score : 183 points Date : 2021-04-29 14:34 UTC (8 hours ago) (HTM) web link (www.bottomupcs.com) (TXT) w3m dump (www.bottomupcs.com) | thamalama wrote: | https://csunplugged.org/en/ | | "Computer science" is a terrible name. Astronomy is not called | "telescope science", and biology is not called "microscope | science". | tantalor wrote: | Telescopes and microscopes are subjects of theory of optics, | which studies the nature of light. | | The "computer" in "computer science" is referring to turing | machines, automata, etc, which are subjects of theory of | computation, which studies the nature of computable functions. | [deleted] | Joker_vD wrote: | Slightly off-topic, but... am I the only one who is troubled by | the fact that the explanations of "fork and exec" are almost | never, ever, discuss the rationale for picking this design | instead of "CreateProcess()" a.k.a. | "just_launch_this_exe_file()", or even acknowledge this | alternative design? A student would probably expect that to | launch an app, there is a system call that would do exactly that: | you pass it the name of the application/executable file, and it | launches. Instead, there is a small dance of duplicating itself | and then re-writing one of the copies--try not to get tangled up | in the legs while doing that! | | The fork+exec has many unobvious advantages over CreateProcess, | but also disadvantages, and one of them is that it's, in my | opinion, a completely unexpected approach: I can't think of any | other system object that can only be created by first copying | another already existing object and then overwriting the newly | created one if needed. Files of any kind | (folder/pipe/socket/etc)? Memory mappings? Signal handling? | Process groups/sessions? Users/groups? The processes seem to be | unique in this regard. How did people come up with this approach? | That would be an interesting discussion, I think. | | Instead, it's just described as the completely self-justified, | with description of "zombies" thrown in the end for boots: and | the zombie processes are pretty much an accidental artifact of | early UNIX design; if the fork was returning not only globally | visible (and therefore unstable) PID but also an fd associated | with the child process, neither wait/waitpid syscalls nor reaping | duties of PID 1 would have been necessary. | patrec wrote: | > The fork+exec has many unobvious advantages over | CreateProcess, | | What are the advantages you see, that are not rooted in the | fact that it's very difficult to create any remotely sane | declarative API in a language as imperative and primitive as C? | In other words which of these advantages would still apply if | you wrote your OS in, say, Ocaml? | | Splitting fork and exec allows you to roughly hew the process | environment of a clone of the parent into shape, process state- | wise, before you sacrifice it to birth the child-process which | will inherit many of these desired (and generally a few | undesired) traits. So one big advantage is that it allows you | to re-use an existing (and growing!) set of imperative commands | to modify aspects of a running process to specify the same | aspects for a child process,and that piecemeal mutation is | basically the only way to express anything of any complexity in | C, particularly if you don't want to break API compatibility | all the time. | | The downside is that you generally end up inheriting a bunch of | cruft that you really didn't want to, that the imperative and | piecemeal nature opens up problems with race conditions, and | that there is a lot of overhead only partially mitigated by | various complex hacks (COW, various more "leightweight" fork | alternatives, special flags to general purpose system calls | that are only there to control behavior upon forking, ...). | dataflow wrote: | What I hate about fork is that it fundamentally doesn't even | make sense. You fork a process with a window, what happens to | that window? Does it get duplicated? Do both control the same | window? It has no sensible behavior in the general case without | cloning the whole machine, and even then, your network isn't | going to get forked. There are limited cases where it could | make sense, but the fact that that's not true in general should | make it fairly obvious that we need a different spawning | primitive. It boggles my mind that people teach fork as if it | could be the primitive. | megous wrote: | Depends on what do you mean by window. If the window lives in | a X server and you clone the client, window will obviously | not get duplicated. | | It's quite clearly defined what gets duplicated and what gets | shared on clone()/fork(). There really is no ambiguity. | Joker_vD wrote: | So, do threads get duplicated or shared? Or something | altogether different happens to them? | megous wrote: | That's documented in man 2 fork. | dataflow wrote: | I'm not saying a particular implementation is ambiguous as | to what it _does_. Obviously any implementation will do | _something_. I 'm saying that fork as a _concept_ is | ambiguous as to what it _should_ do. | robotresearcher wrote: | POSIX defines what a POSIX compliant fork() must do. | dataflow wrote: | Yes it does. | megous wrote: | You're forking a process, not the whole system. Sorry, | but I fail to see the ambiguity. If the window is in the | process as some framebuffer, it gets cloned too. That may | not mean you'll get two windows on your monitor, because | your display engine is HW thing, and not a process. | dataflow wrote: | It's because you're thinking "it does X! it's not | ambiguous!" but still missing that the question is over | why it SHOULD do X instead of Y, not over _whether_ it | does X or Y. To a user, it sure as heck doesn 't make | sense to fork a process and still end up with one window | for both of them. For lots of processes the process _is_ | its windows as far as the user is concerned. | robotresearcher wrote: | Fork isn't an idea that users need to have. Devs may need | to know what their OS API's fork() does. No-one else | cares. | [deleted] | kccqzy wrote: | Well we do have a library call for launching a new process | called posix_spawn(3), but it's complicated mainly because they | are a lot of options you want to configure when launching a | process: which file descriptors would you like to close, which | would you like to share with the newly created process, what | uid the new process should have, what working directory it | should have, etc. It's just a large and complicated call. | | But with fork/exec, all of that setup becomes just normal code | you run after fork but before exec. You want the child not to | have a certain file descriptor? Just close it after fork. You | want to drop privileges when running the child? Same thing, | just setuid/setgid/setgroups after fork. You want to set up | resource limits for the child? Again just setrlimit after fork. | | It avoids a lot of complexity in the system call itself. | (Naturally, it adds some other complexity elsewhere.) | wmf wrote: | Besides fork() and spawn() there's another option which is to | create an empty process, configure it as needed, then start | it. IMO this is what simple, orthogonal primitives looks | like. This generally isn't possible in Unix but AFAIK it's | possible in Mach. | Joker_vD wrote: | Yes, I know. But you only figure it out after you've done you | fair share of IPC and management of worker processes by hand. | When you're a fresh student, this "fork+exec" just seems like | a pretty ridiculous way to organize things: why not just | launch the executable you want to launch immediately? Nope, | that's at best postponed until the chapter on IPC, at worst | it's never discussed at all, so you're left puzzled and with | "okay, I guess that's how things are done, if you say so..." | feeling. | | Oh, and by the way: we have open()/fcntl() for opening files | and then fiddling with their settings instead of one open() | call with 20 arguments; we could easily have had launch() | with the same parameters as execve() that would launch the | new process suspended, then we could use... I dunno, even | fcntl() on the process's descriptor to configure all those | things and then send it SIGCONT when we're done setting it | up. | kccqzy wrote: | When you are a fresh student, and when fork/exec is too | advanced for you, just use system(3). Not recommended for | production use, but that's indeed the easiest way to launch | an executable. | | Do we optimize our system call design for fresh students or | for professionals? | Joker_vD wrote: | We were talking about the presentation of "fundamentals" | in the courses on Computer Science, right? The fork/exec | design is highly non-trivial and so should probably | deserve more discussion, with examples of alternative | designs and design considerations, than mere "that's how | new processes are created, nothing special here, let's | move on". | kps wrote: | If you want to look at it as 'fundamentals', fork only | requires the concept of a task, whereas your top-level | comment's suggestion also requires (a) secondary storage, | (b) organized into files, (c) that can be named. | | (That's not _why_ Unix developed with fork, though.) | Joker_vD wrote: | The (b) and (c) are not really needed: early timesharing | systems managed to launch processes straight from the | punched card decks IIRC. And without (a), what do you | even need the second, identical task for? I guess it | could be used parallelize crunching the numbers, but | that's it; and probably that should perhaps be done from | the system operator layer anyway? Like, "launch 5 copies | of program on this deck but mark one specific slot in | memory different for them, so they would know their ID". | khaledh wrote: | From "Operating Systems: Three Easy Pieces" chapter on "Process | API" (section 5.4 "Why? Motivating The API") [1]: | ... the separation of fork() and exec() is essential in | building a UNIX shell, because it lets the shell run | code after the call to fork() but before the call to | exec(); this code can alter the environment of the about-to-be- | run program, and thus enables a variety of interesting | features to be readily built. ... The | separation of fork() and exec() allows the shell to do a whole | bunch of useful things rather easily. For example: | prompt> wc p3.c > newfile.txt In the example | above, the output of the program wc is redirected into the | output file newfile.txt (the greater-than sign is how | said redirection is indicated). The way the shell | accomplishes this task is quite simple: when the child is | created, before calling exec(), the shell closes standard | output and opens the file newfile.txt. By doing so, any | output from the soon-to-be-running program wc are sent | to the file instead of the screen. | | [1] https://pages.cs.wisc.edu/~remzi/OSTEP/cpu-api.pdf | kingkongjaffa wrote: | In the beginning, bell labs invented the transistor... | tachyonbeam wrote: | I mean there were vacuum tubes and relay machines before that | :) | | IMO relays are a bit easier to understand than transistors | because it's a bit easier to understand how to implement | various logic gates with them with only very basic knowledge of | electronics. | | Also, very satisfying clicking sounds: | https://www.youtube.com/watch?v=JZyFSrNyhy8&t | ggggtez wrote: | This is not computer science. I see nary a reference to data | structures or algorithms, or math of any kind. | | These are all engineering concerns, not science. | globular-toast wrote: | Computer science isn't science. | jll29 wrote: | A program often is a model. | neatze wrote: | hmm, applied science is not engineering ? | ravi-delia wrote: | Applied science is engineering, by and large, but it's | worthwhile to have different words for different things. | 'Engineering' is one, 'Science' is another, even though the | latter is useful to the former. | | Of course, Computer Science is almost no science, most | degrees are some blend of math and engineering. | jll29 wrote: | As with anything that entirely depends who is at work. | Watching Linus Torvalds is going to make you arrive at your | answer, but if you followed Don Knuth, you may have to | reconsider. | mikedilger wrote: | I'm disappointed this didn't start from "sand", quantum | mechanics, silicon doping, electromagnetism, electronics, | transistors, gates and latches, etc. | | Also, Computer Science is a much broader field than what this | website addresses including very little science but actually a | mixture of math (e.g. P vs NP stuff) and engineering (both | hardware and software), this website addressing a subset of the | engineering part. | optimalsolver wrote: | I think these CS books should start with a philosophical look at | what kind of thing computation actually is, rather than diving | straight into UNIX file system structures with barely any | context. | | But I guess that would be computer science from the top down. | maweki wrote: | I am a theorist and I would say, that the bottom - as it is the | foundation - is mathematical logic. And you start from there. | | Baffingly enough: All this theory is easily understood with | 7th-grade math. | | Approaching from some specific piece of engineering is surely | top-down. | akersten wrote: | Yeah, the first page of this book being "Everything is a File | in Unix" is already so high-level and far above what I would | call the "bottom-up" of computer science that this book should | be renamed "Unix from the bottom-up." | | The "bottom" of computer science would be mathematically | explaining what it means to compute, and building up to the | abstraction of a Turing machine, and eventually getting to a | concrete implementation of what we call programming and code. | This book has code on page 2 - but code is nowhere near the | bottom. | hctaw wrote: | Good book on that is _Introduction to the Theory of | Computation_ by Michael Sipser. | | One could argue it's from the _very_ bottom up, not much | practicum lies in that text. | zaksingh wrote: | This book was one of the best parts of my CS degree. It's the | only textbook which I read cover-to-cover because the writing | and narrative were so cohesive. Highly recommend. | pc9 wrote: | Does anyone have a comparison between this and Computer Systems: | A Programmer's Perspective? I was planning to read that book and | this looks like a similar resource. | qntty wrote: | CSAPP is like the theory behind low-level (C/C++/etc) | programming. This is more like a sketch of the concepts behind | operating systems. CSAPP is also much longer and more | technical. Where they overlap CSAPP is going to be much more | complete. | mariodiana wrote: | I can't be the only one, but I feel pretty sure I'm in a minority | among computer programmers. This "bottom-up" approach is just not | the way my brain works. I definitely feel I'm a top-down sort of | thinker and learner, and this puts me at a disadvantage, | regarding the literature and approach in this profession. | | Are the terms describing the two approaches here "empirical" | versus "rational"? I'm talking about building bit by bit as | opposed to getting an overview and then drilling down. I | definitely feel like I benefit from the second. And I almost feel | like it may not be too strong to suggest that I'm actively harmed | by the first. | | Something like the author's approach is _great,_ for my purposes, | when I already pretty much understand something. But I can 't | _start_ here. | | Again, I feel like I'm in a minority. But anyone else have this | turn of mind, too? How do you cope? Do you just look for other | materials? | rambambram wrote: | You put into words what I've been feeling for a long time. So I | guess you're not alone. I 'cope' by just turning it around: | finding some meaningful project, decide what I want it to be, | and then find out one step at a time how to get closer. That | starts with me finding out - and then being comfortable enough | with - a big overview of the project. Somehow I keep coming | back at the 'work is like a hill' metaphor from Ryan Singer in | his book Shape Up (from basecamp). | https://basecamp.com/shapeup/3.4-chapter-13#work-is-like-a-h... | AlanSE wrote: | I think it would be good to re-package this information, but | for me, the critical thing that stuff like this accomplishes is | comprehensiveness. You have to go through so many classes at | school just to have all the knowledge base covered at all. | Everything is compartmentalized by default, so that's kind of | the mother of all bottom-up approaches. | | Top-down takes time and reflection, and re-packaging. | Jtsummers wrote: | Past discussions with comments (including the author addressing | the naming issue): | | https://news.ycombinator.com/item?id=13249675 (2016, 157 | comments) | | https://news.ycombinator.com/item?id=21903007 (2019, 77 comments) | dang wrote: | Also | | _Computer Science from the Bottom Up_ - | https://news.ycombinator.com/item?id=7611093 - April 2014 (44 | comments) | brundolf wrote: | This seems highly focused on systems; I wouldn't call it | representative of "Computer Science" as a whole. Only maybe 2-3 | of the classes in my computer science degree were focused on this | kind of stuff. | | Still seems like a good resource, just perhaps mislabeled. | macintux wrote: | I'm surprised you had that many. My CS program had nothing | comparable that I can remember. | | Definitely not computer science. | brundolf wrote: | We had one called "Systems Programming" that was a whirlwind | intro to everything Unix (6 weeks on bash, 6 weeks on C, 6 | weeks on Perl, a mishmash of SSH and everything-is-a-file and | Unix-philosophy, etc) | | We had another called "Operating Systems" that focused on | things like concurrency, building a really basic memory | pager, etc (mix of C and Java I think) | | And we had one other called "Networking" that was sort of | along the same lines as "Operating Systems" but focused on, | well, networking. Mostly Java; we wrote our own application- | layer protocols, that sort of thing | | But yeah pretty much everything else - including the more | "concrete" programming courses like Data Structures and | Algorithms - was fairly abstracted away from any specific | system | macintux wrote: | Nice, and now that I think of it we did have an operating | systems class. I know better than to trust 25-year-old | memories. | michaelcampbell wrote: | My undergrad CS degree I don't think contained any sort | of OS course, other than maybe a survey of them but there | was a graduate level class that was required for any | higher degree. I took it before I graduated with an | undergrad because it was interesting, and it was quite | extensive as I recall. That was well over 20 years ago, | however. | ivankolev wrote: | That sounds a lot like part of the UofT CS program. | elcapitan wrote: | Yeah it's basically an Operating Systems book like Tanenbaum's | "Modern Operating Systems". | zerkten wrote: | > This seems highly focused on systems; I wouldn't call it | representative of "Computer Science" as a whole. | | I'd agree with this, but it fits with how a certain set of | schools teaches their core. They will teach a set of related | classes around the topics here: computer architecture, | operating systems, and networking, because this is often | central to their research interests. Others will scatter this | information across a number of unrelated courses to check the | box, or skip it entirely because their research doesn't touch | as heavily on these topics. | | Further research on the author suggests they are from New | Zealand. I'm from the UK and live in the US. What I see taught | in US CS degrees is somewhat different from what I saw in the | UK, and heard about from friends in continental Europe. I | imagine the author's experience is similar. | | A further bias from the author, and I'm making big assumptions, | is that see value here because they've worked in places that do | lot of system programming. Mentions of VMware and Red Hat on | https://www.wienand.org. | userbinator wrote: | Yes, the same comment appears whenever this comes up on HN. | Bottom-up would actually be something more like Nand2Tetris or | Petzold's _code_. | _wldu wrote: | Years ago, systems was the general purpose CS degree. I guess | it is/was classical CS. Today, there are many specialties | (data, security, algorithms, etc.), but many top-ranked CS | schools still offer a traditional systems degree. | Yottaqubyter wrote: | The reason for the name seems to be because of its focus on | systems. I do agree that the way it's labeled isn't the best, | but it's not trying to be representative of all Computer | Scionce. | | In fact, it seems to specifically be trying to have a different | focus than a cs degree: | | > This book aims to move in completely the opposite direction, | working from operating systems fundamentals through to how | those applications are complied and executed. | brundolf wrote: | "opposite direction" tends to imply "the same content, just | with a different order/framing". Whereas the OP leaves out | the majority (not all, but more than half) of what people | tend to put under the term "computer science" completely | ChrisArchitect wrote: | anything new added? | | Some earlier discussion: | | 1 year ago https://news.ycombinator.com/item?id=21903007 | | 4 years ago https://news.ycombinator.com/item?id=13249675 | | 7 years ago https://news.ycombinator.com/item?id=7611093 | wildmanx wrote: | As much as I appreciate such material, but this is not "computer | science". Maybe "computer engineering", or some mix of "software | engineering", "systems engineering", those things. | | Computer science is about algorithms, complexity, possibly | foundations of programming languages, cryptography, database | theory, those things. Not about Unix or file formats. | | Those are important too, but not "computer science". | wrnr wrote: | The ELF binary format, really? This is like a course advertising | itself as an introduction to normative ethics and proceeding to | lecture people on canon law in the 14th century in Flanders. Like | sure if you into obscure shit this might be interesting but dunno | what one has to do with the other. | Joker_vD wrote: | Well, more like the common law in the 21st century in the USA. | Quite useful in nowadays practice, but obviously neither | universal nor eternal. But could be a good starting point as a | running example, or a reference point when comparing to the | "civil law" systems. | megous wrote: | Not sure what's obscure about ELF. Almost every one of the | programs I use is wrapped in ELF format, except for maybe a | bootloader and scripts. | | If you want to learn from bottom up, then learning what | programs are seems like an obvious approach. | snicker7 wrote: | Looks more like computer engineering than computer science. | MithrilTuxedo wrote: | Yeah, that kinda bothered me too. | | >Computer Science is no more about computers than astronomy is | about telescopes. | xbar wrote: | Similar but different from | https://www.bartlettpublishing.com/site/product/programming-... | beckman466 wrote: | Just read page one and two and I already LOVE the conversational | style of writing from the author. Who the hell decided we should | write textbooks in such a dry and formal way (this has been my | experience). Maybe the textbooks fail me because the author is | not able to translate their excitement into text (I have ADD, so | there's that too)? Of course it's also due the nature of the | copyright/IP system, together with the hierarchies inherent to | capitalism. The competition between workers selling their labor | works to the advantage of employers/capitalists. | | Anyway, learning is basically just a long conversation where we | learn about the feedback loops about whatever concept or system | we are trying to understand and interact with. For me this | author's style (and others like him) works a million times better | to do that in an intuitive way. ___________________________________________________________________ (page generated 2021-04-29 23:01 UTC)