[HN Gopher] FaCT: Constant Time Programming Language
       ___________________________________________________________________
        
       FaCT: Constant Time Programming Language
        
       Author : ducktective
       Score  : 92 points
       Date   : 2022-10-23 16:15 UTC (6 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | pierrebai wrote:
        | I do wonder if it is possible to be truly certain that code
        | runs in constant time. The problem is that modern CPUs
        | translate their instruction set into internal microcode. For
        | example, the x86 CMOVE instruction (conditional move) could
        | potentially become a branching path once translated to
        | microcode. There are other instructions, like multiplication
        | or division, that may end up having different run times
        | depending on the input.
        
         | mhh__ wrote:
          | It can be done if you insert data dependencies _everywhere_
          | and know which CPU it's running on.
        
         | rdlw wrote:
         | Yes, since every computer has finite memory.
         | 
         | If you bubble sort a list of bytes, say your algorithm runs in
         | 10000+n^2 clock cycles. You have at most an exabyte of storage,
         | so your code will run in at most 10^37 clock cycles! :p
         | 
          | As storage technology improves, of course the coefficient
          | will increase, but at that point it should be considered a
          | different algorithm. This one has undefined behaviour when
          | you have more than an exabyte of storage.
        
           | PeterisP wrote:
            | This article has nothing to do with algorithmic
            | complexity and the "constant time" big-O complexity
            | class; this is about guarantees of code executing in
            | _absolutely_ constant time (i.e. _exactly_ x clock
            | cycles, not "at most") in an input-
           | independent manner to ensure that execution time does not
           | reveal any information about what the inputs/outputs were.
           | 
           | In essence, this is about a programming language that can
           | ensure that execution time always equals the worst case; if
           | there's branching, taking both branches and ignoring one of
           | the results; if there's iteration, doing a constant maximum
           | count of iterations no matter what the input data is, etc.
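            | 
            | A minimal sketch of that both-branches idea, written in
            | Rust rather than FaCT (the function and the mask trick
            | here are illustrative assumptions, not the compiler's
            | actual output):
            | 
            |     // Constant-time select: evaluate both "branches" and
            |     // pick one result with an arithmetic mask, not a jump.
            |     fn ct_select(cond: bool, if_true: u64, if_false: u64) -> u64 {
            |         // all-ones when cond is true, all-zeros otherwise
            |         let mask = (cond as u64).wrapping_neg();
            |         (if_true & mask) | (if_false & !mask)
            |     }
            | 
            |     fn main() {
            |         // Both arguments are evaluated regardless of the
            |         // secret bit, so timing does not depend on it
            |         // (assuming the compiler keeps this branchless).
            |         let secret_bit = true;
            |         let r = ct_select(secret_bit, 7, 11);
            |         println!("{r}");
            |     }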
        
         | mysterydip wrote:
         | It should be possible on microcontrollers at least. I don't
         | believe most have a cache.
        
           | AlotOfReading wrote:
            | It's increasingly rare to see a uC without some kind of
            | memory hierarchy nowadays. Even a barebones RP2040 uses
            | what's effectively an SRAM cache in front of the flash.
            | Unless you have a compelling use case that isn't solved
            | by disabling caching (which any chip can do) or region
            | attributes, it's basically a free spec boost for
            | manufacturers.
           | 
           | Regardless, you probably don't want to do your fast-path
           | cryptography on the slowest coprocessors imaginable.
        
             | whatshisface wrote:
              | The RP2040 is enormous for a microcontroller. A typical
             | microcontroller is something like an Arduino.
        
               | zokier wrote:
                | Not really sure where you are getting your "typical"
                | from; 32-bit MCUs dominate the market over old
                | 8-bitters like the AVRs in the Arduino. And the
                | RP2040 is not particularly enormous as far as ARM
                | microcontrollers go: while it is dual-core, the cores
                | are tiny M0+ ones rather than bigger cores like the
                | M7 or M33.
        
               | AlotOfReading wrote:
                | The Cortex-M0+ used in the RP2040 is also used by
                | Arduino. More broadly, both the SAM and the AVR32
               | families have some level of support for caching. Only the
               | AVR8s don't. I think it's fair to describe 8 bit uCs as
               | "increasingly rare".
        
         | djrenren wrote:
          | Hey, one of the paper authors here. This has been a serious
          | problem in the past (and present), but it is definitely
          | improving. For example, Intel's docs state that CMOVE is
          | constant-time on recent hardware. ARM's DIT (data-
          | independent timing) [1] extension provides hardware-level
          | guarantees, and Intel has extensive docs on how to use
          | their chips for constant-time coding [2].
         | 
         | [1]:
         | https://developer.arm.com/documentation/ddi0601/2020-12/AArc...
         | 
         | [2]:
         | https://www.intel.com/content/www/us/en/developer/articles/t...
        
         | dataflow wrote:
         | > For example, the x86 CMOVE instruction (conditional move)
         | could potentially become a branching path once translated to
         | microcode.
         | 
         | Would you have any links to more reading on this and when it
         | happens? It's the first time I've heard it.
        
         | Someone wrote:
         | If the language spec says a piece of code must execute in
         | constant-time, any deviation is a compiler bug.
         | 
         | But yes, that may mean the language can't be implemented well
         | on lots of modern hardware.
         | 
         | A lot probably can be done by avoiding some instructions,
         | flushing caches very often, clearing branch prediction history
         | (if possible), etc.
         | 
         | That would not make the language win performance shootouts,
         | though.
         | 
         | There also is the problem that not all cores may be equal and
         | that quite a few modern CPUs drop speed 'at will', depending on
         | system load, core temperature, etc.
        
         | mek6800d2 wrote:
          | No, I think not. Much like your examples, I have in mind
          | "False Data Dependencies" in the CPU. I came across this on
          | StackOverflow several years ago. It is a long question with
          | a long, informative thread about seemingly irrational performance
         | differences: "https://stackoverflow.com/questions/25078285/repl
         | acing-a-32-..."
        
       | breck wrote:
       | https://pldb.com/languages/fact-lang.html
        
       | ajkjk wrote:
       | There needs to be a law that every programming language README or
       | homepage needs some sample code front-and-center.
        
         | bee_rider wrote:
         | While it would be nice, this is somewhat mitigated by the fact
         | that they've got a top-level directory called "example."
         | 
         | https://github.com/PLSysSec/FaCT/blob/master/example/example...
        
           | ajkjk wrote:
           | true, I noticed that afterwards.
        
       | andsoitis wrote:
       | ... _cryptographic_ programming language...
        
         | hinkley wrote:
          | Adi Shamir is the S in RSA, and his name has appeared on a
          | multitude of papers about side channel attacks: timing,
          | current draw, and I think EMI. Anything and everything you
          | can measure to differentiate between what happens when you
          | encode message A versus message B, or often worse, the same
          | message with key A versus key B, lets you reduce the search
          | space for brute-force guessing. Knocking n down a bit for a
          | 2^n problem sometimes only reduces a problem from "heat
          | death of the universe" to "nobody is around who could
          | possibly care", but in other cases you can take something
          | from a thousand years to a thousand hours, or all the
          | memory in the US to all the memory in a single cluster.
        
           | ascar wrote:
           | Did you reply to the wrong comment?
        
             | hinkley wrote:
             | >>> Constant Time Programming Language
             | 
             | >> ...cryptographic programming language...
             | 
             | > side channel attacks
             | 
             | I know what I'm about, son.
        
               | samatman wrote:
               | While that's clear from what you said, I also don't
               | exactly see how it works as a reply to GP.
               | 
               | You know the rudeness is against guidelines, as well.
        
               | hinkley wrote:
               | In some cultures, being asked if you belong here is
               | considered passive aggressive, and in some cultures
               | passive aggressive is the worst form of rude.
               | 
                | Tribal knowledge is always a danger. At which point
                | did you guys think I lost the plot? Why code that
                | runs slower is
               | a feature for cryptography? Assuming people know what a
               | side channel attack is? Why the co-creator of one of the
               | longest surviving cipher systems who spent his career
               | breaking other ciphers is relevant background
               | information? Something else?
               | 
               | The summary for the linked package is:
               | 
               | > This is the compiler for the Flexible and Constant Time
               | cryptographic programming language. FaCT is a domain-
               | specific language that aids you in writing constant-time
               | code for cryptographic routines that need to be free from
               | timing side channels.
        
               | cole-k wrote:
               | Are you offended that you were asked if you were replying
               | to the wrong comment? I suppose this could be a cultural
               | faux pas - we must be from different cultures.
               | 
               | I would also be inclined to ask if you meant to reply to
               | another comment. Let me explain. To me the other option
               | is to say that what you wrote makes no sense and I cannot
               | understand why you wrote it. Asking if you meant to reply
               | to another comment is my way of showing that I don't
                | think you're a crank, but instead that you just made a
                | silly mistake any one of us could've made. But if I
                | wrote it the other way you might think that I think
                | you're a crank. Y'know?
               | 
                | Anyway, my understanding of the original comment is
                | that it was meant as a clarification of the title.
                | Hence your first reply, which starts by explaining
                | who Adi Shamir is, confused me too.
        
               | hinkley wrote:
               | Also related to tribal knowledge: One of the sins of
               | technical writing is forgetting to tell people up front
               | why they should care. The assumption that everyone who
               | got here knows exactly why they are here doesn't scale.
               | You write enough docs and people are gonna click on the
               | wrong one. Nobody wants to read half a page of text and
               | get out a dictionary to find out they're in the wrong
               | space. You've dumped all of short term memory, for one
               | thing.
               | 
               | A constant time programming language sounds like a
                | ridiculous notion in most problem domains. It _is_ a
               | ridiculous notion. You only want code that always runs in
               | exactly half a millisecond when you're solving a very,
               | very specific set of problems, like protecting secrets or
               | opening airbags, and then it's one of the most important
               | things.
        
               | chowells wrote:
               | You don't even need constant time for opening airbags.
               | Real time (upper bound, but no lower bound) will do the
               | job. Pretty much it's just working with secrets that
               | cares about constant time.
        
               | emmelaich wrote:
               | > _One of the sins of technical writing is forgetting to
               | tell people up front why they should care._
               | 
                | Hence our confusion at your mentioning Shamir. I went back
               | to the site thinking he may have been a contributor.
        
         | ogogmad wrote:
         | A language like this one can be used to develop cryptography
         | libraries like OpenSSL.
         | 
         | If two documents have the same length, but differ in the time
         | it takes to encrypt them, then their encryption times leak
         | information about their contents. In cryptography, this is an
         | example of a "side channel" - and side channels are considered
         | bad. A "constant time" language helps minimise the presence of
         | such side channels, making it useful in cryptography.
        
       | throwaway81523 wrote:
       | This is lower tech but might also be interesting, for simple
       | embedded apps:
       | 
       | https://github.com/tomahawkins/atom
        
       | nynx wrote:
       | Wow, that's quite a list of installation instructions.
        
       | insanitybit wrote:
       | Really interesting idea. I would be curious to see if this could
       | be embedded into a proc macro in Rust, allowing one to do
        | something like this:
        | 
        |     fn compare_hashes(a: &[u8], b: &[u8]) -> bool {
        |         let size = a.len();
        |         fact!{
        |             for (uint64 i from 0 to size) {
        |                 if (a[i] != b[i]) { return -1; }
        |             }
        |             return 0;
        |         }
        |     }
       | 
       | It would get compiled into asm and inlined.
       | 
       | I'm also wondering how this interacts with bounds checking, which
       | would introduce branches. I tried searching the paper (sorry, no
       | time to read the whole thing today) for 'bounds' but did not get
       | a result.
       | 
       | Semi-related. One (insufficient) _option_ for dealing with timing
        | leaks is to add a random delay, i.e. `sleep_ms(rand(1,1000))`. The
       | problem is that given N leaks an attacker can remove that noise.
       | I have two questions here:
       | 
       | 1. What is "N"?
       | 
       | 2. Does that sleeping introduce any additional vulnerabilities?
        | Basically, is this a case of "if you can afford the latency,
        | it won't hurt"?
        
         | _nhynes wrote:
         | Rust has some great constant time libs already, for instance
         | `subtle` [0]. A `derive(ConstantTimeEq)` might get you most of
         | the way, but a constant-timeifier would be great for wrapping
         | whole algos where you might not want to think too hard about
         | timing side channels.
         | 
         | For your sleeping proposal, it sounds a little like
         | differential privacy [1] where you can add some randomness to
         | gain some privacy but spend some of the predetermined privacy
          | budget. In that case, `N` depends on (roughly) the variance of
         | the computation time, the noise amount, and your privacy
         | budget. If you get it right, it has provable security
         | properties. However, that doesn't work as well when the
         | adversary has access to the machine and can observe the
         | intermediate state (or side channel leaks thereof).
         | 
         | [0]: https://github.com/dalek-cryptography/subtle
         | 
         | [1]: https://blog.openmined.org/privacy-series-basics-
         | definition/
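          | 
          | For reference, a rough sketch of what byte-wise comparison
          | with `subtle` looks like (assuming the crate is added as a
          | dependency; the digest values below are made up):
          | 
          |     use subtle::ConstantTimeEq;
          | 
          |     fn main() {
          |         // Example digests; in practice these would come
          |         // from a hash function or a MAC.
          |         let expected: &[u8] = &[0xde, 0xad, 0xbe, 0xef];
          |         let received: &[u8] = &[0xde, 0xad, 0xbe, 0xee];
          | 
          |         // ct_eq inspects every byte regardless of where
          |         // the first mismatch is, and returns a Choice.
          |         let equal = bool::from(expected.ct_eq(received));
          |         println!("match: {equal}");
          |     }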
        
         | kqr wrote:
         | If you add the same delay for the same input there's no
         | statistical way to remove the noise because at that point it's
         | no longer noise -- it's bias.
         | 
         | That said, it's fairly easy to tell if a CPU spends time
         | sleeping or doing actual work, so it depends on who is in
         | control of the CPU.
        
         | comex wrote:
         | A better alternative is to sleep, not for a fixed time period
         | after the end of the operation, but until a fixed amount of
         | time after the _start_ of the operation (assuming the operation
         | never takes longer than that). That way, in theory, the total
         | time taken is independent of any variability in the timing of
         | the operation itself.
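          | 
          | A rough sketch of that padding idea in Rust (the 50 ms
          | budget is an arbitrary assumption and must exceed the
          | operation's worst-case running time):
          | 
          |     use std::time::{Duration, Instant};
          | 
          |     // Run `op`, then sleep until a fixed deadline measured
          |     // from the start, so the total time is the budget.
          |     fn run_padded<T>(budget: Duration, op: impl FnOnce() -> T) -> T {
          |         let start = Instant::now();
          |         let result = op();
          |         if let Some(left) = budget.checked_sub(start.elapsed()) {
          |             std::thread::sleep(left);
          |         }
          |         result
          |     }
          | 
          |     fn main() {
          |         let out = run_padded(Duration::from_millis(50), || {
          |             // stand-in for the variable-time operation
          |             (0u64..10_000).sum::<u64>()
          |         });
          |         println!("{out}");
          |     }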
         | 
         | But there's still a high risk of some side channel letting the
         | attacker tell when the program is working and when it's
         | sleeping.
        
         | jonhohle wrote:
         | For comparisons, the normal "trick" is to & a failure flag for
         | every character in the hashes so you'll always compare the same
         | number of characters. At the end, the failure flag is returned.
         | This may cost more time in obvious failure cases, but
         | eliminates timing oracles (assuming comparison and & operations
         | require the same number of cycles regardless of the operands).
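          | 
          | A minimal sketch of that accumulate-then-check approach in
          | Rust (it ORs XORed bytes rather than &-ing a flag, but the
          | effect is the same; equal-length inputs are assumed):
          | 
          |     // Compare two equal-length byte slices with no early
          |     // exit: every byte feeds into `diff`, so the work done
          |     // is independent of where (or whether) they differ.
          |     fn ct_equal(a: &[u8], b: &[u8]) -> bool {
          |         assert_eq!(a.len(), b.len());
          |         let mut diff: u8 = 0;
          |         for (x, y) in a.iter().zip(b.iter()) {
          |             diff |= x ^ y;
          |         }
          |         diff == 0
          |     }
          | 
          |     fn main() {
          |         assert!(ct_equal(b"secret", b"secret"));
          |         assert!(!ct_equal(b"secret", b"secrez"));
          |     }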
        
       | rurban wrote:
        | Error in the paper, chapter 4.1, deferred return:
        | 
        | Transformed code, but insecure:
        | 
        |     secret mut uint32 rval = 0;
        |     secret mut bool notRet = true;
        |     if (sec) { rval = 1; notRet = false; }
        |     if (notRet) {
        |         // long-running computation ...
        |     }
        |     return rval;
        | 
        | The long-running block depends on the secret and thus leaks
        | it.
        
         | djrenren wrote:
         | Hey, one of the paper authors here. If the compiler stopped
         | here, yes this would be an error, but subsequent
          | transformations (described in section 4.2) eliminate the branch,
         | ensuring that timing is equivalent regardless of the value of
         | notRet.
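          | 
          | Roughly (a simplified illustration, not the actual compiler
          | output), the branch on notRet becomes a data dependency:
          | the long-running computation always executes, and a mask
          | decides whether its result is kept. In Rust:
          | 
          |     // Illustrative only: both paths always run; a mask
          |     // derived from the secret picks the visible result.
          |     fn deferred_return(sec: bool) -> u32 {
          |         // all-ones when the early return was NOT taken
          |         let not_ret = (!sec as u32).wrapping_neg();
          |         let early = 1u32;             // the `if (sec)` arm
          |         let late = long_running();    // always executed now
          |         (late & not_ret) | (early & !not_ret)
          |     }
          | 
          |     fn long_running() -> u32 {
          |         (0u32..1_000).fold(0, |acc, i| acc.wrapping_add(i))
          |     }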
        
       | amelius wrote:
       | How do they control for timing in memory accesses, which can vary
        | greatly depending on caching layers, etc.?
        
       | c7DJTLrn wrote:
       | Slightly offtopic sorry, but I remember seeing this project years
       | ago for a compiler that emits only mov instructions:
       | 
       | https://github.com/xoreaxeaxeax/movfuscator
       | 
       | Since this is effectively branchless and every instruction would
       | take the same number of micro ops, wouldn't this be a very safe
       | way of writing cryptographic code free of side channels?
        
         | ducktective wrote:
         | There is also a performance question:
         | 
         | https://youtu.be/kbn9UCRK2Qg?t=1228
         | 
          | Judging by the image of generated assembly code in the
          | movfuscator repo, I don't think the mov-only solution would
          | be efficient.
        
           | c7DJTLrn wrote:
           | Absolutely, the performance would be terrible, but under
            | circumstances where you need to be sure there are no side
           | channels (like in a secure element kind of chip) maybe that's
           | acceptable.
        
           | insanitybit wrote:
           | If all you're doing is comparing two hash values it's
           | probably a lot less important that it's fast and more
           | important that it's side channel resistant.
        
             | WJW wrote:
             | Being both side channel resistant _and_ fast would be even
             | better though. Movfuscator is a cool project but there are
             | definitely better ways to generate constant time code.
        
         | djrenren wrote:
         | One of the authors of FaCT here. This is a great question,
          | because at first blush it feels like it might: Movfuscator
          | creates a branchless program. But my guess is that we'll run
         | into two key problems:
         | 
         | 1. Leakage via cache. If our memory access patterns are
         | influenced by secret data, then we can detect variance in
         | execution time as a result of cache hits/misses. Movfuscator
         | generates code that does lots of loads and stores using
         | application data as addresses so my guess is that even if your
         | source program didn't depend on secrets in this way, the output
         | code probably still would.
         | 
          | 2. Termination rules. Movfuscator programs run inside a
          | giant loop. Every execution of that loop drives execution
          | forward, and every instruction is executed on every
          | iteration. Even _if_ the body of this loop is constant-time
          | (see above for why it's probably not), we need to consider
          | how the program actually terminates. For example, if I
          | write the following C code:
          | 
          |     for (int i = 0; password[i] != 0 && entry[i] != 0; i++) {
          |         if (password[i] != entry[i]) return false;
          |     }
          |     return true;
         | 
         | We can see that it takes fewer iterations to check entries
         | which are incorrect earlier. For example, if the password is
         | "foo" and the entry is "bar", then we return in the first
         | iteration, as opposed to the entry "fob" which returns on the
          | third. Thus, if the program's termination time is affected
          | by the secret value, an attacker could still detect timing
          | variance, because the full program would terminate sooner
          | even if each loop iteration took the same amount of time.
        
           | c7DJTLrn wrote:
           | This is a great answer, thank you.
        
       | ducktective wrote:
       | His presentation at Strange Loop:
       | 
       | https://www.thestrangeloop.com/2018/fact-a-new-language-for-...
       | 
       | https://youtu.be/kbn9UCRK2Qg
        
       ___________________________________________________________________
       (page generated 2022-10-23 23:00 UTC)