Abstract:
Numerical models are used extensively for simulating complex physical
systems including fluid flows, astronomical events, weather, and
climate.  Many researchers struggle to bring their model developments
from single-computer, interpreted languages to parallel high-performance
computing (HPC) systems.  There are initiatives to make interpreted
languages such as MATLAB, Python, and Julia feasible for HPC
programming.  In this talk I argue that the computational overhead
is far costlier than any potential development time saved.  Instead,
doing model development in C and unix tools from the start minimizes
porting headaches between platforms, reduces energy use on all
systems, and ensures reproducibility of results.


## brcon2020 - 2020-05-02

        title: Energy-efficient programming in science

       author: Anders Damsgaard (adc)

      contact: anders@adamsgaard.dk
               gopher://adamsgaard.dk
               https://adamsgaard.dk


## About me

* 33 y/o Dane
* #bitreich-en since 2019-12-16

present:

* postdoctoral scholar at Stanford University (US)
* lecturer at Aarhus University (DK)

previous:

* Danish Environmental Protection Agency (DK)
* Scripps Institution of Oceanography (US)
* National Oceanic and Atmospheric Administration (NOAA, US)
* Princeton University (US)

#pause
academic interests:

* ice sheets, glaciers, and climate
* earthquake and landslide physics
* modeling of fluid flows and granular materials


## Numerical modeling

* numerical models used for simulating complex physical systems

  * n-body simulations: planetary formation, icebergs, soil/rock mechanics

  * fluid flows (CFD): aerodynamics, weather, climate


* domains and physical processes split up into small, manageable chunks


## From idea to application


    1. Construct system of equations

      |
      v

    2. Derivation of numerical algorithm

      |
      v

    3. Prototype in high-level language

      |
      v

    4. Re-implementation in low-level language


## From idea to application

 ,-----------------------------------------------.
 |  1. Construct system of equations             |
 |                                               |
 |    |                                          |
 |    v                                          |          _
 |                                               |     ___ | | __
 |  2. Derivation of numerical algorithm         |    / _ \| |/ /
 |                                               |   | (_) |   <
 |    |                                          |    \___/|_|\_\
 |    v                                          |
 |                                               |
 |  3. Prototype in high-level language          |
 `-----------------------------------------------'
      |                                              _       _
      v                                             | | ___ | | __
                                                    | |/ _ \| |/ /
    4. Re-implementation in low-level language      |_| (_) |   <
                                                    (_)\___/|_|\_\


## Numerical modeling

      task: Solve partial differential equations (PDEs) by stepping through time
            PDEs: conservation laws; mass, momentum, enthalpy

   example: Heat diffusion through a homogeneous medium

            ∂T
            -- = k ∇²(T)
            ∂t

    domain:

       .---------------------------------------------------------------------.
       |                                                                     |
       |                                  T                                  |
       |                                                                     |
       '---------------------------------------------------------------------'

## Numerical modeling

    domain: discretize into n=7 cells

       .---------+---------+---------+---------+---------+---------+---------.
       |         |         |         |         |         |         |         |
       |    T₁   |    T₂   |    T₃   |    T₄   |    T₅   |    T₆   |    T₇   |
       |         |         |         |         |         |         |         |
       '---------+---------+---------+---------+---------+---------+---------'

#pause
* Numerical solution with high-level programming:

    MATLAB: sol = pdepe(0, @heat_pde, @heat_initial, @heat_bc, x, t)

    Python: fenics.solve(lhs==rhs, heat_pde, heat_bc)

     Julia: sol = solve(heat_pde, CVODE_BPF(linear_solver=:Diagonal); rel_tol, abs_tol)

        (the above are not entirely equivalent, but you get the point...)


## Numerical solution: Low-level programming

    example BC: outer boundaries constant temperature (T₁ & T₇)

* computing ∇²(T)

       .---------+---------+---------+---------+---------+---------+---------.
       |         |         |         |         |         |         |         |
  t    |    T₁   |    T₂   |    T₃   |    T₄   |    T₅   |    T₆   |    T₇   |
       |         |         |         |         |         |         |         |
       '----|--\-+----|--\-+-/--|--\-+-/--|--\-+-/--|--\-+-/--|----+-/--|----'
            |   \     |   \ /   |   \ /   |   \ /   |   \ /   |     /   |
            |    \    |    /    |    /    |    /    |    /    |    /    |
            |     \   |   / \   |   / \   |   / \   |   / \   |   /     |
       .----|----+-\--|--/-+-\--|--/-+-\--|--/-+-\--|--/-+-\--|--/-+----|----.
       |         |         |         |         |         |         |         |
t + dt |    T₁   |    T₂   |    T₃   |    T₄   |    T₅   |    T₆   |    T₇   |
       |         |         |         |         |         |         |         |
       '---------+---------+---------+---------+---------+---------+---------'
       |<- dx  ->|
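
* written out for interior cell i, the stencil above is the standard
  central-difference approximation (this is the update rule behind the
  code on the next slide):

        ∇²(T) ≈ (Tᵢ₊₁ - 2 Tᵢ + Tᵢ₋₁) / dx²

        Tᵢ(t+dt) = Tᵢ(t) + k (Tᵢ₊₁ - 2 Tᵢ + Tᵢ₋₁) / dx² · dt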


## Numerical solution: Low-level programming

* explicit solution with central finite differences:

        for (t=0.0; t<t_end; t+=dt) {
            for (i=1; i<n-1; i++)
                T_new[i] = T[i] + k*(T[i+1] - 2.0*T[i] + T[i-1])/(dx*dx) * dt;
            tmp = T;       /* swap buffers: T now holds the updated values */
            T = T_new;
            T_new = tmp;
        }
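
* a minimal, self-contained version of the above that compiles as-is;
  the initial condition and the values of N, k, dx, and t_end are
  illustrative choices, not from the talk:

        #include <stdio.h>

        #define N 7    /* number of cells, matching the grid above */

        int
        main(void)
        {
            double T_a[N] = {1.0}, T_b[N] = {1.0}; /* T1 = 1, T2..T7 = 0 */
            double *T = T_a, *T_new = T_b, *tmp;
            double k = 1.0, dx = 1.0, t_end = 5.0, t;
            double dt = 0.25*dx*dx/k; /* below stability limit dx*dx/(2*k) */
            int i;

            for (t = 0.0; t < t_end; t += dt) {
                for (i = 1; i < N-1; i++) /* boundary cells stay constant */
                    T_new[i] = T[i] + k*(T[i+1] - 2.0*T[i] + T[i-1])/(dx*dx)*dt;
                tmp = T;
                T = T_new;
                T_new = tmp;
            }
            for (i = 0; i < N; i++) /* one cell index and temperature per line */
                printf("%d\t%g\n", i, T[i]);
            return 0;
        }

* it slots straight into a plain unix pipeline, e.g.:

        cc -o heat heat.c && ./heat | gnuplot -p -e "plot '-' w lp"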
#pause

* implicit, iterative solution with central finite differences
  (backward Euler, solved with Jacobi iterations):

        for (t=0.0; t<t_end; t+=dt) {
            for (i=0; i<n; i++)
                T_old[i] = T[i];
            do {
                r_norm_max = 0.0;
                for (i=1; i<n-1; i++) {
                    T_new[i] = (T_old[i] + k*dt/(dx*dx)*(T[i+1] + T[i-1]))
                               / (1.0 + 2.0*k*dt/(dx*dx));
                    if (fabs((T_new[i] - T[i])/T[i]) > r_norm_max)
                        r_norm_max = fabs((T_new[i] - T[i])/T[i]);
                }
                tmp = T;
                T = T_new;
                T_new = tmp;
            } while (r_norm_max > RTOL);
        }

* unconditionally stable: dt is no longer limited by the cell size, at
  the cost of several iterations per time step


## HPC platforms

* Stagnation of CPU clock frequency

* Performance through massively parallel deployment (MPI, GPGPU)

    * NOAA/DOE NCRC Gaea cluster
        * 2x Cray XC40, "Cray Linux Environment"
        * 4160 nodes, each 32 to 36 cores, 64 GB memory
        * InfiniBand interconnect
        * total: 200 TB memory, 32 PB SSD, 5.25 petaflops (peak)

## A (non-)solution

* problem: high-level, interpreted code with extensive solver libraries -> low-level, compiled, parallel code

* suggested workaround: port interpreted high-level languages to HPC platforms

#pause

NO!

* high computational overhead
* overhead multiplied across many machines
* reduced performance and energy efficiency


## A better way

    1. Construct system of equations

      |
      v

    2. Derivation of numerical algorithm

      |
      v

    3. Prototype in low-level language

      |
      v

    4. Add parallelization for HPC (see the sketch below)

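* a minimal sketch of step 4 for the 1-D heat solver above: decompose
  the domain across MPI ranks and exchange one ghost cell with each
  neighbor every time step (the names exchange_ghost_cells and n_local
  are illustrative, not from the talk):

        #include <mpi.h>

        /* T holds n_local interior cells plus one ghost cell at each
         * end; neighboring ranks own the adjacent subdomains */
        void
        exchange_ghost_cells(double *T, int n_local, int rank, int n_ranks)
        {
            MPI_Status status;

            if (rank > 0)          /* swap edge cell with left neighbor */
                MPI_Sendrecv(&T[1], 1, MPI_DOUBLE, rank-1, 0,
                             &T[0], 1, MPI_DOUBLE, rank-1, 0,
                             MPI_COMM_WORLD, &status);
            if (rank < n_ranks-1)  /* swap edge cell with right neighbor */
                MPI_Sendrecv(&T[n_local], 1, MPI_DOUBLE, rank+1, 0,
                             &T[n_local+1], 1, MPI_DOUBLE, rank+1, 0,
                             MPI_COMM_WORLD, &status);
        }

* each rank then runs the serial stencil loop over its own cells; the
  physics code is unchanged
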
## Example: Ice-sheet flow with sediment/fluid modeling


    --------------------------._____                   ATMOSPHERE
                ----->              ```--..
        ICE                                 `-._________________      __
                ----->                             ------>      |vvvv|  |vvv
                                               _________________|    |__|
                ----->                      ,'
                                          ,'    <><      OCEAN
                ---->                    /                        ><>
    ____________________________________/___________________________________
      SEDIMENT  -->
    ________________________________________________________________________

* example: granular dynamics and fluid flow simulation for glacier flow

* 90% of Antarctic ice-sheet mass is driven by ice flow over sediment

* need to understand basal sliding of ice in order to project sea-level rise


## Algorithm matters

                sphere: git://src.adamsgaard.dk/sphere
                        C++, CUDA C, cmake, Python, ParaView
                        massively parallel, GPGPU
                        detailed physics
                        20,191 LOC
#pause
                        3 months of computing time on an Nvidia Tesla K40 (2880 cores)

#pause
* gained understanding of the mechanics (what matters and what doesn't)
* simplified the physics, algorithm, and numerics

#pause
    1d_fd_simple_shear: git://src.adamsgaard.dk/1d_fd_simple_shear
                        C99, makefiles, gnuplot
                        single-threaded
                        simple physics
                        2,348 LOC
#pause
                        real: 0m00.07 s on a laptop from 2012

#pause
                        ...guess which one is more portable?

## Summary

for numerical simulation:

* high-level languages
    * easy
    * produce results quickly
    * do not develop low-level programming skills
    * no insight into the numerical algorithm
    * realistically speaking: no direct way to HPC

* low-level languages
    * require low-level skills
    * save electrical energy
    * port directly to HPC: just sprinkle some MPI on top


## Thanks

    20h && /names #bitreich-en