RISKS-LIST: RISKS-FORUM Digest Sunday, 13 December 1987 Volume 5 : Issue 73 FORUM ON RISKS TO THE PUBLIC IN COMPUTERS AND RELATED SYSTEMS ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator Contents: Australian datacom blackout (Barry Nelson) Finally, a primary source on Mariner 1 (John Gilmore, Doug Mink, Marty Moore) Re: Computer-controlled train runs red light (Nancy Leveson) Re: interconnected ATM networks (John R. Levine, Darren New) Control-tower fires (dvk) Loss-of-orbiter (Dani Eder) Re: EEC Product Liability (John Gilmore) The Presidential "Football"... (Carl Schlachte) Radar's Growing Vulnerability (Jon Eric Strayer) The RISKS Forum is moderated. Contributions should be relevant, sound, in good taste, objective, coherent, concise, nonrepetitious. Diversity is welcome. Contributions to RISKS@CSL.SRI.COM, Requests to RISKS-Request@CSL.SRI.COM. For Vol i issue j, FTP SRI.COM, CD STRIPE:, GET RISKS-i.j. Volume summaries for each i in max j: (i,j) = (1,46),(2,57),(3,92),(4,97). ---------------------------------------------------------------------- Date: Tue, 8 Dec 87 16:11:43 EST From: Barry Nelson Subject: Australian datacom blackout To: risks@csl.sri.com Cc: telecom@xx.lcs.mit.edu From The Australian, 23 November 1987, Sydney, Australia, Page 1, 2nd edition. [without permission] 8-Column BANNER: SABOTEUR TRIED TO BLACK OUT AUSTRALIA The heart of Sydney's business district remains in chaos after a dangerously well-informed saboteur wreaked havoc on the city's fragile telecommunications system in an attack intended to destroy [Australian] Telecom's operations nationwide. [An estimated 2000 central city services remain out this morning] Investigators described the sinister saboteur as a lone, former Telecom employee with expert knowledge of the underground cables network. [...] But Telecom said it could have been much worse. [only Sydney was hit] but all international services are routed through Sydney [...] 
[The attacker entered the underground tunnels and] severed 24 of the 600 heavy cables in 10 carefully selected locations. The bizarre attack knocked out 35,000 telephone lines in 40 Sydney suburbs and brought dozens of [ATMs, POS, stores, telex, facsimile and betting office] services to a standstill. [...] Hundreds of computers broke down, leaving communications and computer specialists to ponder the real possibility of vital information being erased from tapes in banking, insurance and other industries. [The largest banks and the international and local PTT offices were all cut off. Speculation is that the attacker's information was over two years old because the same attack at that time would have completely crippled Telecom Australia. Security locks have now been put on the manhole covers. Just the reconnection effort is estimated to cost millions of dollars and full damages will not be known until businesses have time to detect losses. A man seen leaving a manhole on Wednesday night was possibly the saboteur reconnoitering his targets. ...] Page 2, 4-columns, 5x7 foto - SABOTAGE IS A NIGHTMARE FOR TELECOM'S WEARY BAND Four hundred Telecom managers, technicians and linesmen worked frantically toward today's 9am deadline [to restore the damaged services. Some worked 48 hours straight with only brief napping.] When the enormity of the sabotage was realised (sic) on Friday, a team of technicians and linesmen was sent into the tunnels to discover the damage. The cuts, which were only a centimeter across, could only be found by touch in the dark, dank tunnels. "The workmen had to run their hands along the entire length of the cables until all the cuts were discovered. Some of them walked over 20 miles on Friday night", said Roger Bamber, the [New South Wales] Telecom Operations Manager [The system contains 27 km of tunnels. It is estimated that the damage could have been done by one well-prepared man over a period of less than one hour.] 
Things started to go wrong in the city about 7pm on Friday, and workmen searched through the night until 6am to find all the damage. [Other searches were launched over half the state for bombs or other evidence of sabotage.] [ ... in other articles ] [Employees' anger at turncoat Telecom policies suggests an insider hacked the cables. The telephone workers' union objects to deregulation, which has resulted in years of acrimonious debate. Last week's Telecom statements suggested that an independent regulator will be created. The union doesn't approve of this action and prefers monopoly.] -----One must wonder if the REAL crime was obscured by the Telecom outage----- ------------------------------ Date: Sun, 13 Dec 87 05:30:10 PST From: hoptoad.UUCP!gnu@cgl.ucsf.edu (John Gilmore) To: RISKS@KL.SRI.COM Subject: Finally, a primary source on Mariner 1 My friend Ted Flinn at NASA (flinn@toad.com) dug up this reference to the Mariner 1 disaster, in a NASA publication SP-480, "Far Travelers -- The Exploring Machines", by Oran W. Nicks, NASA, 1985. "For sale by the Superintendent of Documents, US Government Printing Office, Wash DC." Nicks was Director of Lunar and Planetary Programs for NASA at the time. The first chapter, entitled "For Want of a Hyphen", explains: "We had witnessed the first launch from Cape Canaveral of a spacecraft that was directed toward another planet. The target was Venus, and the spacecraft blown up by a range safety officer was Mariner 1, fated to ride aboard an Atlas/Agena that wobbled astray, potentially endangering shipping lanes and human lives." ..."A short time later there was a briefing for reporters; all that could be said -- all that was definitely known -- was that the launch vehicle had strayed from its course for an unknown reason and had been blown up by a range safety officer doing his prescribed duty." 
"Engineers who analyzed the telemetry records soon discovered that two separate faults had interacted fatally to do in our friend that disheartening night. The guidance antenna on the Atlas performed poorly, below specifications. When the signal received by the rocket became weak and noisy, the rocket lost its lock on the ground guidance signal that supplied steering commands. The possibility had been foreseen; in the event that radio guidance was lost the internal guidance computer was supposed to reject the spurious signals from the faulty antenna and proceed on its stored program, which would probably have resulted in a successful launch. However, at this point a second fault took effect. Somehow a hyphen had been dropped from the guidance program loaded aboard the computer, allowing the flawed signals to command the rocket to veer left and nose down. The hyphen had been missing on previous successful flights of the Atlas, but that portion of the equation had not been needed since there was no radio guidance failure. Suffice it to say, the first U.S. attempt at interplanetary flight failed for want of a hyphen." ------------------------------ Date: Tue, 8 Dec 87 11:42:36 EST From: mink%cfa@harvard.harvard.edu (Doug Mink) To: risks@csl.sri.com Subject: Mariner 1 from NASA reports JPL's Mariner Venus Final Project Report (NASA SP-59, 1965) gives a chronology of the final minutes of Mariner 1 on page 87: 4:21.23 Liftoff 4:25 Unscheduled yaw-lift maneuver "...steering commands were being supplied, but faulty application of the guidance equations was taking the vehicle far off course." 4:26:16 Vehicle destroyed by range safety officer 6 seconds before separation of Atlas and Agena would have made this impossible. In this report, there is no detail of exactly what went wrong, but "faulty application of the guidance equations" definitely points to computer error. 
"Astronautical and Aeronautical Events of 1962," is a report of NASA to the House Committee on Science and Astronautics made on June 12, 1963. It contains a chronological list of all events related to NASA's areas of interest. On page 131, in the entry for July 27, 1962, it states: NASA-JPL-USAF Mariner R-1 Post-Flight Review Board determined that the omission of a hyphen in coded computer instructions transmitted incorrect guidance signals to Mariner spacecraft boosted by two-stage Atlas-Agena from Cape Canaveral on July 21. Omission of hyphen in data editing caused computer to swing automatically into a series of unnecessary course correction signals which threw spacecraft off course so that it had to be destroyed. So it was a hyphen, after all. The review board report was followed by a Congressional hearing on July 31, 1962 (ibid., p.133): In testimony before House Science and Astronautics Committee, Richard B. Morrison, NASA's Launch Vehicles Director, testified that an error in computer equations for Venus probe launch of Mariner R-1 spacecraft on July 21 led to its destruction when it veered off course. Note that an internal review was called AND reached a conclusion SIX DAYS after the mission was terminated. I haven't had time to look up Morrison's testimony in the Congressional Record, but I would expect more detail there. The speed with which an interagency group could be put together to solve the problem so a second launch could be made before the 45-day window expired, contrasted with the lack of speed with which more recent problems (not just the Challenger, but the Titan, Atlas, and Ariane problems of 1986) were handled, says something about 1) how risks were accepted in the 60's, 2) growth in complexity of space-bound hardware and software, and/or 3) growth of the bureaucracy, each member of which is trying to avoid taking the blame. 
It may be that the person who made the keypunch error (the hyphen-for-minus theory sounds reasonable) was fired, but the summary reports I found indicated that the spacecraft loss was accepted as part of the cost of space exploration. Doug Mink, Harvard-Smithsonian Center for Astrophysics, Cambridge, MA Internet: mink@cfa.harvard.edu UUCP: {ihnp4|seismo}!harvard!cfa!mink ------------------------------ Date: 11 Dec 87 16:54:00 EST From: "Marty Moore" Subject: Mariner I To: "risks" I've just caught up on two months of back RISKS issues. I have the following to contribute on Mariner I, based on my time at the Cape: 1. Mariner I was before my time, but I was told the story by a mathematician who had been at the Cape since 1960. According to him, an algorithm, written as mathematical formulae, involved a Boolean entity R. At the point of failure, the mathematician had written NOT-R, that is, "R" with a bar above the character; however, the programmer implementing the algorithm overlooked the bar, and so used R when he should have used NOT-R. This explanation could subsequently have been interpreted as "missing hyphen", "missing NOT", or "data entry problem", all of which we've seen in recent contributions. 2. I think the FORTRAN version of the story is very unlikely. Remember that the error occurred in a critical on-board computer. I consider it extremely unlikely that such a computer would have been programmed in FORTRAN in 1962, considering that the first use I saw of FORTRAN in a ground-based critical system at the Cape was not until 1978! (Of course, I wasn't aware of *every* computer in use, so there may have been an earlier use of FORTRAN, but I'd be surprised if it was more than a few years earlier.) The originator of the FORTRAN version of the story may have been aware of another error caused by the period/comma substitution, and also aware of the Mariner problem as a "single character" error, and incorrectly associated the two. 
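Marty Moore's account can be made concrete. Below is a toy sketch (all names, values, and structure invented for illustration; this is emphatically not the actual 1962 guidance code) of how overlooking a single overbar, so that NOT-R is read as R, inverts a fallback test and makes the system trust exactly the data it was supposed to reject:

```python
# Hypothetical illustration of the NOT-R / R theory of the Mariner 1 loss.
# All identifiers and behavior are invented; only the shape of the bug is real.

def steering_command(radio_ok, radio_signal, stored_program_signal):
    """As specified: trust radio guidance only while it is reliable."""
    if radio_ok:
        return radio_signal          # good lock: follow ground guidance
    return stored_program_signal     # lost lock: fly the stored program

def steering_command_buggy(radio_ok, radio_signal, stored_program_signal):
    """As (allegedly) implemented: the bar over R was overlooked, so the
    test is inverted and noisy radio data is obeyed when it should be
    rejected."""
    if not radio_ok:                 # NOT-R where the formula said R
        return radio_signal          # spurious commands steer the vehicle
    return stored_program_signal

# With a failed antenna (radio_ok == False), the correct version flies the
# stored program; the buggy one obeys the noise.
noise = "veer left, nose down"
plan = "hold programmed trajectory"
assert steering_command(False, noise, plan) == plan
assert steering_command_buggy(False, noise, plan) == noise
```

The single-character nature of the inversion is consistent with all three retellings: a dropped overbar, a "missing hyphen", or a "data editing" slip.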
[There were other messages (e.g., from Eric Roberts, Eugene Miya, and Jim Valerio) on this subject as well, but there is too much redundancy or lack of definitude to include them all... PGN] ------------------------------ To: risks@csl.sri.com Subject: Re: Computer-controlled train runs red light Date: Sat, 12 Dec 87 20:31:44 -0800 From: Nancy Leveson In Risks 5.69, Steve Nuchia writes: >Surely these engineers can't be so paranoid as to think that an exact >duplication of their (primarily digital) relay-based control system in >software would be hard to verify. It should at least be possible to build a >software implementation that could be easily shown to be equivalent to the >relays, leaving aside the problem of validating an arbitrary "spaghetti code" >implementation. The failure modes of mechanical systems are usually well understood and very limited in number. Therefore, system safety engineers are able to build in interlocks and other safety devices to control these hazards. The failure modes of software are much more complex and less is known about how to control software hazards. Even if the same functionality is implemented in the software, that does not mean that the failure modes and mechanisms are identical nor that the complexity of the two systems is equivalent. Software also exhibits discontinuities not usually found in mechanical relay systems. If identical function is implemented in software, then the probability of requirements errors in the software is equivalent to design errors in the mechanical system. But there is an additional possibility of introducing implementation errors in the software. 
Given identical function of both types of systems (and thus identical probability of accidents arising from problems in this functional design), then the additional probability of design and coding errors in the software is not necessarily identical to the probability of random "wearout" failures in the mechanical system (the primary cause of failures in mechanical systems). >Automobile traffic light control boxes, based on relay technology quite >similar to that used in railroads, fail every so often due to ants building >mounds in the nice warm cabinets. People have been killed by this bug in a >relay system, yet it fails to generate the kind of emotional response that >software bugs do. --- Certainly there are accidents in conventional mechanical systems. However, the concern about software bugs is more than just an irrational emotional response. There are very good scientific reasons for it. Besides that noted above (greater understanding of failure modes and mechanisms in mechanical systems and thus better methods to control hazards), it is also possible to perform risk assessment on mechanical systems due to reuse of standard components with historical failure probability data. This is not possible for software. Certainly these risk figures are not always accurate, but it is not irrational to feel more comfortable about a system with a calculated risk of an accident of 10^-9 over 10 years time than a system with a calculated risk of "?". Besides, I question whether accidents caused by mechanical failures generate less emotional response than accidents caused by software bugs. Consider Challenger and Three Mile Island. It is natural for computer scientists to have considerable interest in computer-related accidents and reasonable for non-computer scientists to be worried about software bugs. Nancy Leveson, UCI ------------------------------ Date: Tue, 8 Dec 87 22:06:27 EST From: johnl@ima.ISC.COM (John R. 
Levine) To: risks@csl.sri.com Subject: Re: interconnected ATM networks The story about BayBanks vs. Bank of Boston ATM cards is even more interesting than it initially sounds. BayBanks and Bank of Boston are arch-rivals in consumer banking, and they run the two largest ATM networks in the region, XPress 24 and Yankee 24, respectively. (Yankee 24 is a consortium, but Bank of Boston is by far the largest participating bank.) When Yankee 24 was expanded from its Connecticut base to cover all of New England, XPress 24 was invited to join, but they declined and BayBanks has since filed an anti-trust suit against Yankee 24, so far to no effect. A few years ago, the two banks jointly set up a system of retail store cash dispensers called Money Supply. Both XPress 24 cards and Monec cards (Bank of Boston's previous network, now folded into Yankee 24) work at Money Supply machines. One day shortly after Money Supply came up, while waiting for a plane at Logan Airport in Boston, I noticed that one of the BayBank XPress 24 machines had a small Money Supply sticker on it, and upon trying my Bank of Boston card, was surprised to discover that it worked. Subsequent experimentation showed that other than the four airport BayBank machines, neither bank's machines accepted the other's cards, and the XPress 24 machines gave a peculiar message that "your bank has restricted use of this card at this terminal." The fact that Bank of Boston cards worked at the airport was not widely known, even at the two banks. Thus I was as surprised as anybody to discover that when both banks joined NYCE, they started taking each other's cards, since I was under the impression that BayBanks' network already routed Bank of Boston requests via other paths which were usually blocked, and vice versa. This suggests that perhaps BayBanks doesn't entirely understand how their ATM network routes messages to off-network banks. If I were they, I'd be pretty nervous. 
John Levine, johnl@ima.isc.com or ima!johnl or Levine@YALE.something ------------------------------ To: RISKS@kl.sri.com Subject: Re: ATM PIN numbers Date: Sun, 29 Nov 87 21:40:17 -0500 From: new@UDEL.EDU For what it is worth, the PINs for Mellon cards are not stored on the cards. I had both a checking and a savings account at Mellon. Several years after opening them, I closed the checking account but retained the savings account. All of a sudden, the card no longer worked. I visited a branch office in person to find out what happened. It seems that when the checking account closed, the first digit of the PIN changed. The clerk implied that I had simply forgotten what the number was, but this was not the case; I had been using the number for years. I suspect that the data entry person who closed the account bumped the wrong key on the screen form, accidentally changing the PIN field. I never followed it further. However, since the card was never out of my possession, I know that the PIN is not on the card. With regard to Otto Makela's "Your bank's computer is down" message appearing after entering the PIN: I suspect that all of your information is gathered before any connection to your bank is attempted. This prevents tying up the lines during "think time". I think the X.25 standards even include a special kind of "open connection" packet, whereby an encrypted batch of data is sent off and a yes/no reply comes back without any true "connection" ever being established. Of course, this does not invalidate any of his points, nor does it imply that other countries or banks follow the same protocols as Mellon Bank, USA. Darren New [For the record, there were somewhat overlapping messages from John McLeod, Robert Stroud, B.J. Herbison, and Peter da Silva. PGN] ------------------------------ Date: Wed, 9 Dec 87 10:28:01 EST From: dvk@SEI.CMU.EDU To: risks@csl.sri.com Subject: Control-tower fires Control-tower fire - a nightmare that wasn't... 
I was flying out of Cairo airport in 1982 or so, and the night before they had had a control tower fire. The immediately visible ramifications of this were that none of the terminal monitors (the flip chart kind you see in European train stations) were working, and the gate agents reported delays on almost every outbound flight (I am not sure about inbound flights - I got to the airport at 6:45am (for a 10:30am flight) so there was not much inbound). Cairo International is a fairly busy airport, yet most of the flights were departing within an hour of scheduled departure time (i.e., they were "on time" for Cairo). The reason for this is that they had ATCs in the burned-out shell of the control tower visually sighting aircraft on the ground (and possibly in the air), communicating via walkie-talkies to the aircraft and to ground-based directors who literally waved the planes onto the runways. Basically, everything worked. Why? Because the airport was able to shift into a manual mode of operation when the tower (and computers?) were down. There were no super failsafes to get in the way. Now, I am not advocating the removal of failsafes. What I am suggesting is that our current failsafes be made a little less restrictive. In Chuck Weinstock's post about O'Hare, the aircraft had trouble getting fuel because of safety interlocks, even when technicians *knew* there was no danger to the fuel feed. In Cairo, the whole system was toasted, but it kept running. Granted, there are differences, but there are also lessons to be learned here. Failsafes should keep you from making stupid mistakes, but not prevent you from making intelligent decisions. 
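The principle above, that a failsafe should block careless mistakes but yield to an informed human, corresponds to a familiar design pattern: a default-deny interlock with an explicit, audited override. A minimal sketch (hypothetical names and rules, not any real airport or fueling system):

```python
# Toy sketch of an interlock that fails safe by default but allows a
# deliberate, logged override. Everything here is invented to illustrate
# the design principle, not drawn from any actual system.

class FuelInterlock:
    def __init__(self):
        self.log = []                # audit trail of overrides

    def request_fuel(self, sensors_ok, override_by=None):
        """Permit fueling when the sensors agree it is safe, or when a
        named person explicitly accepts responsibility for overriding."""
        if sensors_ok:
            return True              # normal case: interlock satisfied
        if override_by is not None:
            self.log.append(f"override by {override_by}")
            return True              # intelligent decision, on the record
        return False                 # default: fail safe

lock = FuelInterlock()
assert lock.request_fuel(sensors_ok=True) is True
assert lock.request_fuel(sensors_ok=False) is False   # stupid mistakes blocked
assert lock.request_fuel(sensors_ok=False, override_by="technician") is True
assert lock.log == ["override by technician"]
```

The override path is deliberately noisy: it requires a named person and leaves a record, so the failsafe restrains carelessness without forbidding judgment.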
------------------------------ Date: Tue, 8 Dec 87 11:39:44 pst From: ucbcad!ames.UUCP!uw-beaver!ssc-vax!eder@ucbvax.Berkeley.EDU (Dani Eder) To: uw-beaver!KL.SRI.COM!RISKS Subject: Loss-of-orbiter (Re: RISKS DIGEST 5.70) Reliability work done here at Boeing (as part of the Advanced Launch System program) predicts the loss-of-orbiter rate to be 1 in 60 launches AFTER the fixes in progress are completed. The loss-of-crew rate is somewhat lower, since there are accidents where you render an Orbiter unusable, but do not kill the crew. For example, landing hard can stress the structure enough that it would be unsafe to ever fly again, even with no visible damage. What our reliability work indicates is that adopting airplane-like design rules, such as the ability to fly a mission with a single engine failure, all engines running before launch, double- and triple-redundant flight control systems, and powered (jet engine) return to a runway for the booster stage, should bring the loss-of-payload rate for a next-generation rocket to 1 in 5000 flights. The lesson we learned from the commercial airplane side of the company is: use improved technology (such as lighter structural materials and smaller electronics) to get better reliability rather than a few more pounds of performance. Your hardware will last longer, and costs will come down more that way. Dani Eder/Boeing/Advanced Space Transportation ------------------------------ Date: Sun, 13 Dec 87 05:13:40 PST From: hoptoad.UUCP!gnu@cgl.ucsf.edu (John Gilmore) To: RISKS@KL.SRI.COM Subject: Re: EEC Product Liability > For imported goods, the original importer into the EEC is liable. I am curious how long the US->European email/netnews gateway at mcvax will last after its first suit under this Directive. Plenty of buggy PD and redistributable software enters the EEC this way; in fact, it may be the largest single channel for import of software. 
> It is expected that the Act will greatly increase the adoption of software > Quality Assurance (to conform to ISO standard ISO 9001) and the use of > mathematically rigorous specification and development methods (VDM, Z etc). Note that this is posted by someone who makes his living selling such products (at Praxis). I would say "caveat emptor" but clearly in Europe this no longer applies. It might be fun for someone to sue Praxis for bugs in their product, especially bugs that result in delivered systems with undiagnosed failures which later cause suits. Does Lloyd's of London sell "bug insurance"? ------------------------------ From: hplabs!motsj1!motbos!mcdham!carl@ucbvax.Berkeley.EDU Date: Sat, 12 Dec 87 10:17:19 PST Apparently-To: ucbvax!CSL.SRI.COM!RISKS Subject: The Presidential "Football"... I am looking for information related to the "Black Box" that is supposedly near the President at all times. This box is reportedly the control center from which the President can authorize a nuclear launch. I have heard it referred to as "The Football". Can anyone tell me anything about it? Even folklore is acceptable. Are there any texts with this information in them? Whatever you could let me know would be a help. I am writing a fictional account of a Nuclear War and need the information to complete the work. Thanks in advance for your help. Carl Schlachte [Folklore may be OK for Carl, but please provide him with folklore privately, and keep RISKS messages factual. PGN] ------------------------------ Date: Thu, 10 Dec 87 15:51:10 EST From: ndq@h.cc.purdue.edu (Jon Eric Strayer) To: risks@kl.sri.com Subject: Radar's Growing Vulnerability >From: Peter G. Neumann ... (RISKS readers will recall that the British investigation concluded that the Sheffield's own radars were jammed by a communication back to London that was being held at the time.) While there are anti-radiation missiles, the Exocet that hit the Sheffield was not one of them. 
I also have serious doubts that the Sheffield's radars were "jammed" by a communication transmitter. I understand that the radars (and ESM/ECM equipment) were shut off because they jammed the comm equipment. [Yes, that was one report. Sorry I turned it around. PGN] ------------------------------ End of RISKS-FORUM Digest ************************